Greedy Type Algorithms in Banach Spaces and Applications¹


V. N. Temlyakov

Department  of  Mathematics,  University  of  South  Carolina,  Columbia,  SC  29208 

Abstract. We continue to study the efficiency of approximation and the convergence
of greedy type algorithms in uniformly smooth Banach spaces. Two greedy type
approximation methods, the Weak Chebyshev Greedy Algorithm (WCGA) and the
Weak Relaxed Greedy Algorithm (WRGA), have been introduced and studied in
[T1]. These methods (WCGA and WRGA) are very general approximation methods
that work well in an arbitrary uniformly smooth Banach space $X$ for any dictionary
$\mathcal{D}$. It turned out that these general approximation methods are also very good for
specific dictionaries. It has been observed in [DKT] that the WCGA and WRGA
provide constructive methods in $m$-term trigonometric approximation in $L_p$,
$p \in [2, \infty)$, which realize the optimal rate of $m$-term approximation for different function
classes. In [T2] the WCGA and WRGA have been used in constructing deterministic
cubature formulas for a wide variety of function classes with error estimates similar to
those for the Monte Carlo Method. The WCGA and WRGA can be considered as a
constructive deterministic alternative to (substitute for) some powerful probabilistic
methods. This observation encourages us to continue a thorough study of the WCGA
and WRGA.

In this paper we study modifications of the WCGA and WRGA that are motivated
by numerical applications. In these modifications we allow the steps of the
WCGA (or WRGA) to be performed approximately, with some controlled errors. We prove that the
modified versions of the WCGA and WRGA perform as well as the WCGA and
WRGA.

We give two applications of greedy type algorithms. First, we use them to provide
a constructive proof of optimal estimates for best $m$-term trigonometric approximation
in the uniform norm. Second, we use them to construct deterministic sets of
points $\{\xi^1, \dots, \xi^m\} \subset [0,1]^d$ with $L_p$ discrepancy less than $C p^{1/2} m^{-1/2}$, where $C$ is an
effective absolute constant.


1.  Introduction 

The purpose of this paper is to continue investigations of nonlinear $m$-term
approximation. We concentrate here on studying $m$-term approximation with regard
to redundant dictionaries in Banach spaces. This paper is based on the paper [T1],
which in turn is a combination of ideas and methods developed for Banach spaces
in a fundamental paper [DGDS] with the approach used in [T3] in the case of
Hilbert spaces. The papers [DGDS] and [T3] contain detailed historical remarks
and we refer the reader to those papers. Two greedy type approximation methods,
the Weak Chebyshev Greedy Algorithm (WCGA) and the Weak Relaxed Greedy
Algorithm (WRGA), have been introduced and studied in [T1]. These methods
(WCGA and WRGA) are very general approximation methods that work well in
an arbitrary uniformly smooth Banach space $X$ for any dictionary $\mathcal{D}$ (see below).
Surprisingly, it turned out that these general approximation methods are also very
good for specific dictionaries. It has been observed in [DKT] that the WCGA and
WRGA provide constructive methods in $m$-term trigonometric approximation in
$L_p$, $p \in [2, \infty)$, which realize the optimal rate of $m$-term approximation for different
function classes. In [T2] the WCGA and WRGA have been used in constructing
deterministic cubature formulas for a wide variety of function classes with error
estimates similar to those for the Monte Carlo Method. It appears that the WCGA
and WRGA can be considered as a constructive deterministic alternative to
(substitute for) some powerful probabilistic methods. This observation encourages us
to continue a thorough study of the WCGA and WRGA.

¹ This research was supported by the National Science Foundation Grant DMS 0200187 and
by ONR Grant N00014-91-J1343.

In Sections 2 and 3 we study modifications of the WCGA and WRGA that are
motivated by numerical applications. In these modifications we allow the steps
of the WCGA (or WRGA) to be performed approximately, with some controlled errors. We
prove that the modified versions of the WCGA and WRGA perform as well as the
WCGA and WRGA.

In Section 4 we use the WCGA and WRGA to build a constructive method for
$m$-term trigonometric approximation in the uniform norm. It is known that the case
of approximating by $m$-term trigonometric polynomials in the uniform norm is the
most difficult. We note that in the case of $L_p$-norms with $p < \infty$ the corresponding
constructive method has been provided in [DKT].

In Section 5 we study a slight modification of an incremental type algorithm from
[DGDS]. We apply that algorithm to construct deterministic sets of points with
small $L_p$ discrepancy and also with small symmetrized $L_p$ discrepancy.

Let $X$ be a Banach space with norm $\|\cdot\|$. We say that a set of elements (functions)
$\mathcal{D}$ from $X$ is a dictionary if each $g \in \mathcal{D}$ has norm less than or equal to one ($\|g\| \le 1$),

$g \in \mathcal{D}$ implies $-g \in \mathcal{D}$,

and $\overline{\operatorname{span}}\, \mathcal{D} = X$. We note that in [T1] we required in the definition of a dictionary
normalization of its elements ($\|g\| = 1$). However, it is easy to check that the
arguments from [T1] work under the assumption $\|g\| \le 1$ instead of $\|g\| = 1$. In the
applications in Section 5 it will be more convenient for us to have the assumption
$\|g\| \le 1$ rather than normalization of the dictionary.

We will study in this paper two types of greedy algorithms with regard to $\mathcal{D}$.
For an element $f \in X$ we denote by $F_f$ a norming (peak) functional for $f$:

$\|F_f\| = 1, \qquad F_f(f) = \|f\|.$

The existence of such a functional is guaranteed by the Hahn-Banach theorem. Let
$\tau := \{t_k\}_{k=1}^\infty$ be a given sequence of nonnegative numbers $t_k \le 1$, $k = 1, \dots$. We
define first (see [T1]) the Weak Chebyshev Greedy Algorithm (WCGA), which is a
generalization for Banach spaces of the Weak Orthogonal Greedy Algorithm defined
and studied in [T3] (see also [DT2] for the Orthogonal Greedy Algorithm).

Weak Chebyshev Greedy Algorithm (WCGA). We define $f_0^c := f_0^{c,\tau} := f$.
Then for each $m \ge 1$ we inductively define

1). $\varphi_m^c := \varphi_m^{c,\tau} \in \mathcal{D}$ is any satisfying

$F_{f_{m-1}^c}(\varphi_m^c) \ge t_m \sup_{g \in \mathcal{D}} F_{f_{m-1}^c}(g).$

2). Define

$\Phi_m := \Phi_m^\tau := \operatorname{span}\{\varphi_j^c\}_{j=1}^m,$

and define $G_m^c := G_m^{c,\tau}$ to be the best approximant to $f$ from $\Phi_m$.

3). Denote

$f_m^c := f_m^{c,\tau} := f - G_m^c.$
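The three steps of the WCGA can be sketched in the simplest setting, which is an assumption made here for illustration and not the general Banach space case of the paper: $X = \mathbb{R}^n$ with the Euclidean norm. There the norming functional of $f_{m-1}$ is the inner product with $f_{m-1}/\|f_{m-1}\|$ and the best approximant from $\Phi_m$ is the orthogonal projection, so the WCGA reduces to the Weak Orthogonal Greedy Algorithm. The function name `wcga_euclidean` and the rule "take the first element that passes the weak test" are hypothetical choices.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def wcga_euclidean(f, dictionary, t, steps):
    """Weak Chebyshev greedy steps in R^n with the Euclidean norm (Hilbert case).

    Uses a constant weakness parameter t in (0, 1].
    Returns the residual norms ||f_0||, ||f_1||, ...
    """
    residual = list(f)
    basis = []  # orthonormal basis of span{phi_1, ..., phi_m}
    norms = [math.sqrt(dot(residual, residual))]
    for _ in range(steps):
        # Up to the positive factor 1/||f_{m-1}||, F_{f_{m-1}}(g) = <f_{m-1}, g>.
        scores = [dot(residual, g) for g in dictionary]
        best = max(scores)
        if best <= 1e-14:
            break  # the residual is numerically zero
        # Step 1: any g with F(g) >= t * sup F(g); here the first such g.
        phi = next(g for g, s in zip(dictionary, scores) if s >= t * best)
        # Step 2: project f onto span{phi_j} via Gram-Schmidt.
        q = list(phi)
        for b in basis:
            c = dot(q, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        qn = math.sqrt(dot(q, q))
        if qn > 1e-14:
            q = [qi / qn for qi in q]
            basis.append(q)
            c = dot(residual, q)
            # Step 3: new residual f_m = f - G_m.
            residual = [ri - c * qi for ri, qi in zip(residual, q)]
        norms.append(math.sqrt(dot(residual, residual)))
    return norms

# Symmetric dictionary in R^3: the unit vectors, one diagonal, and their negatives.
_s = 1.0 / math.sqrt(3.0)
_half = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [_s, _s, _s]]
example_dictionary = _half + [[-x for x in g] for g in _half]
example_norms = wcga_euclidean([1.0, 2.0, 3.0], example_dictionary, 0.5, 3)
```

In this Euclidean sketch each selected element has a positive inner product with a residual that is orthogonal to the current span, so the span grows by one dimension per step and the residual vanishes after at most $n$ steps.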

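We study here the following modification of the WCGA. Let three sequences
$\tau = \{t_k\}_{k=1}^\infty$, $\delta = \{\delta_k\}_{k=0}^\infty$, $\eta = \{\eta_k\}_{k=1}^\infty$ of numbers from $[0,1]$ be given.

Approximate Weak Chebyshev Greedy Algorithm (AWCGA). We define
$f_0 := f_0^{\tau,\delta,\eta} := f$. Then for each $m \ge 1$ we inductively define

1). $F_{m-1}$ is a functional with properties

$\|F_{m-1}\| \le 1, \qquad F_{m-1}(f_{m-1}) \ge \|f_{m-1}\|(1 - \delta_{m-1});$

and $\varphi_m := \varphi_m^{\tau,\delta,\eta} \in \mathcal{D}$ is any satisfying

$F_{m-1}(\varphi_m) \ge t_m \sup_{g \in \mathcal{D}} F_{m-1}(g).$

2). Define

$\Phi_m := \operatorname{span}\{\varphi_j\}_{j=1}^m,$

and denote

$E_m(f) := \inf_{\varphi \in \Phi_m} \|f - \varphi\|.$

Let $G_m \in \Phi_m$ be such that

$\|f - G_m\| \le E_m(f)(1 + \eta_m).$

3). Denote

$f_m := f_m^{\tau,\delta,\eta} := f - G_m.$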

The term approximate in this definition means that we use a functional $F_{m-1}$
that is an approximation to the norming (peak) functional $F_{f_{m-1}}$ and also we use
an approximant $G_m \in \Phi_m$ which satisfies a weaker assumption than being a best
approximant of $f$ from $\Phi_m$.
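The relaxed requirement on $F_{m-1}$ in step 1) can be checked numerically in a concrete space. The following is a minimal sketch under the assumption $X = \ell_p^n$ with $1 < p < \infty$, where the norming functional of $f$ has coefficients $\operatorname{sign}(f_i)|f_i|^{p-1}/\|f\|_p^{p-1}$ and the norm of a functional is the $\ell_{p'}$ norm of its coefficients, $p' = p/(p-1)$. The helper names and the tolerance `tol` are hypothetical.

```python
import math

def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def norming_functional(f, p):
    """Coefficients c of F_f(g) = sum_i c_i g_i in l_p^n, 1 < p < inf, f != 0."""
    norm = lp_norm(f, p)
    return [math.copysign(abs(x) ** (p - 1), x) / norm ** (p - 1) for x in f]

def apply_functional(c, g):
    return sum(ci * gi for ci, gi in zip(c, g))

def is_admissible(c, f, p, delta, tol=1e-12):
    """Check ||F|| <= 1 and F(f) >= ||f|| (1 - delta), as in step 1) of the AWCGA."""
    dual = p / (p - 1.0)
    norm_ok = lp_norm(c, dual) <= 1.0 + tol
    peak_ok = apply_functional(c, f) >= lp_norm(f, p) * (1.0 - delta) - tol
    return norm_ok and peak_ok

f = [3.0, -4.0, 1.0]
exact = norming_functional(f, 4)
# Norming functional of a perturbed element: an approximation to F_f.
approx = norming_functional([3.1, -3.9, 1.2], 4)
```

Here `exact` satisfies the condition with $\delta = 0$, while `approx` only satisfies it for a positive $\delta$; this is the kind of inexact functional that the AWCGA is designed to tolerate.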

The following Weak Relaxed Greedy Algorithm has been studied in [T1].

Weak Relaxed Greedy Algorithm (WRGA). We define $f_0^r := f_0^{r,\tau} := f$ and
$G_0^r := G_0^{r,\tau} := 0$. Then for each $m \ge 1$ we inductively define

1). $\varphi_m^r := \varphi_m^{r,\tau} \in \mathcal{D}$ is any satisfying

$F_{f_{m-1}^r}(\varphi_m^r - G_{m-1}^r) \ge t_m \sup_{g \in \mathcal{D}} F_{f_{m-1}^r}(g - G_{m-1}^r).$

2). Find $0 \le \lambda_m \le 1$ such that

$\|f - ((1 - \lambda_m) G_{m-1}^r + \lambda_m \varphi_m^r)\| = \inf_{0 \le \lambda \le 1} \|f - ((1 - \lambda) G_{m-1}^r + \lambda \varphi_m^r)\|$

and define

$G_m^r := G_m^{r,\tau} := (1 - \lambda_m) G_{m-1}^r + \lambda_m \varphi_m^r.$

3). Denote

$f_m^r := f_m^{r,\tau} := f - G_m^r.$
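As an illustration of the WRGA steps, here is a minimal sketch under the assumption that $X = \mathbb{R}^n$ with the Euclidean norm, so that the norming functional of the residual is an inner product and the one-dimensional minimization in step 2) has a closed form. The function name `wrga_euclidean` and the selection rule (first element passing the weak test) are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def wrga_euclidean(f, dictionary, t, steps):
    """Weak relaxed greedy steps in Euclidean R^n; returns ||f - G_m||, m = 0, 1, ..."""
    G = [0.0] * len(f)
    errors = [math.sqrt(dot(f, f))]
    for _ in range(steps):
        r = [fi - gi for fi, gi in zip(f, G)]
        # Step 1: F(g - G_{m-1}) is proportional to <r, g - G_{m-1}>.
        directions = [[gi - Gi for gi, Gi in zip(g, G)] for g in dictionary]
        scores = [dot(r, d) for d in directions]
        best = max(scores)
        threshold = t * best if best > 0 else best
        d = next(d for d, s in zip(directions, scores) if s >= threshold)
        # Step 2: minimise ||r - lam * d|| over lam in [0, 1] (closed form, then clip).
        dd = dot(d, d)
        lam = 0.0 if dd == 0.0 else min(1.0, max(0.0, dot(r, d) / dd))
        G = [Gi + lam * di for Gi, di in zip(G, d)]
        # Step 3: record the norm of the new residual.
        r = [fi - gi for fi, gi in zip(f, G)]
        errors.append(math.sqrt(dot(r, r)))
    return errors

# Dictionary {+-e_i} in R^3; its closed convex hull is the l_1 unit ball.
unit = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
example_dictionary = unit + [[-x for x in g] for g in unit]
example_errors = wrga_euclidean([0.2, 0.3, -0.1], example_dictionary, 0.7, 50)
```

Because $\lambda = 0$ is always admissible in step 2), the error sequence of this sketch cannot increase from one step to the next.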


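We will study here the following approximate version of the WRGA.

Approximate Weak Relaxed Greedy Algorithm (AWRGA). We define
$f_0^{ar} := f_0^{ar,\tau,\delta,\eta} := f$ and $G_0^{ar} := G_0^{ar,\tau,\delta,\eta} := 0$. Then for each $m \ge 1$ we inductively
define

1). $F_{m-1}^{ar}$ is a functional with properties

$\|F_{m-1}^{ar}\| \le 1, \qquad F_{m-1}^{ar}(f_{m-1}^{ar}) \ge \|f_{m-1}^{ar}\|(1 - \delta_{m-1});$

and $\varphi_m^{ar} := \varphi_m^{ar,\tau,\delta,\eta} \in \mathcal{D}$ is any satisfying

$F_{m-1}^{ar}(\varphi_m^{ar} - G_{m-1}^{ar}) \ge t_m \sup_{g \in \mathcal{D}} F_{m-1}^{ar}(g - G_{m-1}^{ar}).$

2). Find $0 \le \lambda_m \le 1$ such that

$\|f - ((1 - \lambda_m) G_{m-1}^{ar} + \lambda_m \varphi_m^{ar})\| \le \min\Big(\|f - G_{m-1}^{ar}\|,\ \inf_{0 \le \lambda \le 1} \|f - ((1 - \lambda) G_{m-1}^{ar} + \lambda \varphi_m^{ar})\|(1 + \eta_m)\Big)$

and define

$G_m^{ar} := G_m^{ar,\tau,\delta,\eta} := (1 - \lambda_m) G_{m-1}^{ar} + \lambda_m \varphi_m^{ar}.$

3). Denote

$f_m^{ar} := f_m^{ar,\tau,\delta,\eta} := f - G_m^{ar}.$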

We study in Sections 2 and 3 the questions of convergence and the rate of
convergence for the two methods of approximation AWCGA and AWRGA. It is clear
that in the case of the AWRGA the assumption that $f$ belongs to the closure of the convex
hull of $\mathcal{D}$ is natural. We denote the closure of the convex hull of $\mathcal{D}$ by $A_1(\mathcal{D})$. It has
been proven in [T3] that in the case of a Hilbert space both algorithms WCGA and
WRGA give the approximation error for the class $A_1(\mathcal{D})$ of the order

$\Big(1 + \sum_{k=1}^m t_k^2\Big)^{-1/2}.$

We consider here approximation in uniformly smooth Banach spaces. For a Banach
space $X$ we define the modulus of smoothness

$\rho(u) := \sup_{\|x\| = \|y\| = 1} \Big(\frac{1}{2}(\|x + uy\| + \|x - uy\|) - 1\Big).$

A uniformly smooth Banach space is one with the property

$\lim_{u \to 0} \rho(u)/u = 0.$

It is easy to see that for any Banach space $X$ its modulus of smoothness $\rho(u)$ is an
even convex function satisfying the inequalities

(1.1)    $\max(0, u - 1) \le \rho(u) \le u, \qquad u \in (0, \infty).$
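The definition of $\rho(u)$ can be explored numerically. The sketch below assumes the two-dimensional space $\ell_p^2$ and replaces the supremum over the unit sphere by a maximum over a finite grid of unit vectors, so it yields a lower estimate of $\rho(u)$ rather than its exact value. The function names and the grid size `n` are hypothetical. For $p = 2$ the grid contains orthogonal pairs, for which the known Hilbert space value $(1 + u^2)^{1/2} - 1$ is attained.

```python
import math

def lp_unit_vectors(p, n):
    """n points on the unit sphere of l_p^2, obtained by normalising (cos a, sin a)."""
    pts = []
    for k in range(n):
        a = 2.0 * math.pi * k / n
        x, y = math.cos(a), math.sin(a)
        s = (abs(x) ** p + abs(y) ** p) ** (1.0 / p)
        pts.append((x / s, y / s))
    return pts

def modulus_estimate(u, p, n=200):
    """Grid lower estimate of rho(u) = sup (||x + u y|| + ||x - u y||)/2 - 1 in l_p^2."""
    def norm(v):
        return (abs(v[0]) ** p + abs(v[1]) ** p) ** (1.0 / p)
    pts = lp_unit_vectors(p, n)
    best = 0.0
    for x in pts:
        for y in pts:
            plus = norm((x[0] + u * y[0], x[1] + u * y[1]))
            minus = norm((x[0] - u * y[0], x[1] - u * y[1]))
            best = max(best, 0.5 * (plus + minus) - 1.0)
    return best

hilbert_value = modulus_estimate(0.5, 2)   # close to sqrt(1.25) - 1
l3_value = modulus_estimate(1.5, 3)        # lies between u - 1 and u by (1.1)
```

The lower bound in (1.1) is visible in the sketch through the pairs with $y = x$, which give exactly $u - 1$ when $u > 1$; the upper bound is the triangle inequality applied to each pair.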

It has been established in [DGDS] that the approximation error of an algorithm
analogous to our WRGA with $t_k = 1$, $k = 1, 2, \dots$, for the class $A_1(\mathcal{D})$ can be
expressed in terms of the modulus of smoothness of the Banach space. Namely, if the modulus
of smoothness $\rho$ of $X$ satisfies the inequality $\rho(u) \le \gamma u^q$, $q > 1$, then the error
is of order $O(m^{1/q - 1})$. We proved in [T1] that both algorithms WCGA and WRGA
provide approximation for the class $A_1(\mathcal{D})$ in a Banach space $X$ with modulus of
smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$, of order

(1.2)    $\Big(1 + \sum_{k=1}^m t_k^p\Big)^{-1/p}, \qquad p := \frac{q}{q-1}.$

It also has been proved in [T1] that the WCGA converges for any $f \in X$ and the WRGA
converges for any $f \in A_1(\mathcal{D})$ if $\tau$ satisfies the condition

(1.3)    $\sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) = \infty.$

The sequences $\{\xi_m(\rho, \tau, \theta)\}$ are defined in Definition 2.1 of Section 2. In the particular
case of $\rho(u) \asymp u^q$, $1 < q \le 2$, the relation (1.3) is equivalent to

(1.4)    $\sum_{m=1}^\infty t_m^p = \infty, \qquad p := \frac{q}{q-1}.$

In [T1] we gave an example which showed that (1.4) is sharp for Banach spaces
with modulus of smoothness of power type $q$.

It is well known (see for instance [DGDS], Lemma B.1) that in the case $X = L_p$,
$1 \le p < \infty$, we have

(1.5)    $\rho(u) \le \begin{cases} u^p/p & \text{if } 1 \le p \le 2, \\ (p-1)u^2/2 & \text{if } 2 \le p < \infty. \end{cases}$

It is also known (see [LT], p. 63) that for any $X$ with $\dim X = \infty$ one has

$\rho(u) \ge (1 + u^2)^{1/2} - 1$

and for every $X$, $\dim X \ge 2$,

$\rho(u) \ge C u^2, \qquad C > 0.$

This limits the power type modulus of smoothness of nontrivial Banach spaces to the
case $1 \le q \le 2$.

We prove in Sections 2 and 3 that under some reasonable assumptions on the
sequences $\delta$ and $\eta$ the AWCGA and AWRGA are as good as the corresponding WCGA
and WRGA. As an example we formulate here only one result (see Corollary 2.3 in
Section 2 below).

Theorem 1.1. Let $X$ be a uniformly smooth Banach space. Assume that $\tau = \{t\}$,
$t \in (0, 1]$. Then for any two sequences $\delta, \eta \in c_0$ the corresponding AWCGA
converges for any $f \in X$.

We recall that $c_0$ is the space of all sequences convergent to $0$.

In Sections 4 and 5 we demonstrate the power of the WCGA and WRGA in classical
areas of harmonic analysis and numerical integration. The first problem concerns
the trigonometric $m$-term approximation in the uniform norm. Let $\mathcal{T}(N)$ be the
subspace of real trigonometric polynomials of order $N$ and let $\mathcal{T}$ be the real
trigonometric system

$1, \sin x, \cos x, \sin 2x, \cos 2x, \dots$
Denote for $f \in L_p(\mathbb{T})$

$\sigma_m(f, \mathcal{T})_p := \inf_{c_1, \dots, c_m;\ \phi_1, \dots, \phi_m \in \mathcal{T}} \Big\|f - \sum_{j=1}^m c_j \phi_j\Big\|_p$

the best $m$-term trigonometric approximation of $f$ in the $L_p$-norm. It is clear that
one can get an upper estimate for $\sigma_{2m+1}(f, \mathcal{T})_p$ by approximating $f$ by
trigonometric polynomials of order $m$. Denote

$E_m(f, \mathcal{T})_p := \inf_{t \in \mathcal{T}(m)} \|f - t\|_p.$

The first result that indicated an advantage of $m$-term approximation over
approximation by trigonometric polynomials of order $m$ is due to R.S. Ismagilov [I]:

(1.6)    $\sigma_m(|\sin x|, \mathcal{T})_\infty \le C_\epsilon m^{-6/5 + \epsilon}$, for any $\epsilon > 0$.

Let us compare it to the well known result due to de la Vallée Poussin and S.N.
Bernstein:

(1.7)    $E_m(|\sin x|, \mathcal{T})_\infty \asymp m^{-1}.$

V.E. Maiorov [M] improved the estimate (1.6):

(1.8)    $\sigma_m(|\sin x|, \mathcal{T})_\infty \asymp m^{-3/2}.$

Both R.S. Ismagilov [I] and V.E. Maiorov [M] used constructive methods to get
their estimates (1.6) and (1.8). V.E. Maiorov [M] applied a number theoretical
method based on Gaussian sums. The key point of that technique can be formulated
in terms of best $m$-term approximation of trigonometric polynomials. Using the
Gaussian sums one can prove (constructively) the estimate

(1.9)    $\sigma_m(t, \mathcal{T})_\infty \le C N^{3/2} m^{-1} \|t\|_1, \qquad t \in \mathcal{T}(N).$

Denote

$\Big\|a_0/2 + \sum_{k=1}^N (a_k \cos kx + b_k \sin kx)\Big\|_A := |a_0| + \sum_{k=1}^N (|a_k| + |b_k|).$

We note that by the simple inequality

$\|t\|_A \le (2N + 1)\|t\|_1, \qquad t \in \mathcal{T}(N),$

the estimate (1.9) follows from the estimate

(1.10)    $\sigma_m(t, \mathcal{T})_\infty \le C (N^{1/2}/m) \|t\|_A.$

Thus (1.10) is stronger than (1.9). The following estimate is known (see [DT1]):

(1.11)    $\sigma_m(t, \mathcal{T})_\infty \le C m^{-1/2} (\ln(1 + N/m))^{1/2} \|t\|_A.$

In a way (1.11) is much stronger than (1.10) and (1.9). However, the existing proof
of (1.11) (see [DT1]) is not constructive. The estimate (1.11) has been proved in
[DT1] with the help of a nonconstructive theorem of Gluskin [G]. In Section 4 we
give a constructive proof of (1.11). The key ingredient of that proof is the WCGA
(or WRGA). In the paper [DKT] we already pointed out that the WCGA provides
a constructive proof of the estimate

(1.12)    $\sigma_m(t, \mathcal{T})_p \le C(p) m^{-1/2} \|t\|_A, \qquad p \in [2, \infty).$

The known proofs (before [DKT]) of (1.12) were nonconstructive (see the discussion in
[DKT, Section 5]).

We formulate here a general result from Section 4 (see Theorem 4.6).

Theorem 1.2. Let $\Phi := \{\phi_j\}_{j=1}^\infty$ be a uniformly bounded orthonormal system
defined on a bounded domain. Assume $\Phi$ has the (VP) property. Then there exists
a constructive algorithm $A(\Phi, N, m)$ such that for any $\phi \in \Phi(N)$ it provides an
$m$-term $\Phi$-polynomial $A(\Phi, N, m)(\phi)$ with the following approximation property

$\|\phi - A(\Phi, N, m)(\phi)\|_\infty \le C m^{-1/2} (\ln(1 + N/m))^{1/2} \|\phi\|_A$

with a constant $C$ which may depend on $\Phi$.

The (VP) property is a property that guarantees the existence of a sequence of
de la Vallée Poussin operators. See Section 4 for the precise definition.

In Section 5 we apply greedy type algorithms to construct points with small
discrepancy and small symmetrized discrepancy. Let $1 \le p < \infty$. We will define
first the $L_p$ discrepancy (the $L_p$-star discrepancy) of points $\{\xi^1, \dots, \xi^m\} \subset \Omega_d :=
[0,1]^d$. Let $\chi_{[a,b]}(\cdot)$ be the characteristic function of the interval $[a, b]$. Denote for
$x, y \in \Omega_d$

$B(x, y) := \prod_{j=1}^d \chi_{[0, x_j]}(y_j).$

Then the $L_p$ discrepancy of $\xi := \{\xi^1, \dots, \xi^m\} \subset \Omega_d$ is defined by

$D(\xi, m, d)_p := \Big\|\int_{\Omega_d} B(x, y)\,dy - \frac{1}{m} \sum_{\mu=1}^m B(x, \xi^\mu)\Big\|_{L_p(\Omega_d)}.$

We are interested in $\xi$ with small discrepancy. Consider

$D(m, d)_p := \inf_{\xi} D(\xi, m, d)_p.$
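For $p = 2$ the discrepancy defined above can be evaluated exactly. Expanding the square of the $L_2(\Omega_d)$ norm and integrating term by term gives the closed form usually called Warnock's formula (a standard identity, not stated in this paper):

$D(\xi, m, d)_2^2 = 3^{-d} - \frac{2}{m} \sum_{\mu} \prod_{j} \frac{1 - (\xi_j^\mu)^2}{2} + \frac{1}{m^2} \sum_{\mu, \nu} \prod_{j} (1 - \max(\xi_j^\mu, \xi_j^\nu)).$

A minimal sketch of this computation follows; the function name `l2_star_discrepancy` is hypothetical.

```python
import math

def l2_star_discrepancy(points):
    """L_2 star discrepancy of a list of points in [0,1]^d via the closed-form expansion."""
    m = len(points)
    d = len(points[0])
    # Integral of (x_1 ... x_d)^2 over the unit cube.
    term1 = 3.0 ** (-d)
    # Cross term: integral of prod x_j over the box [xi_j, 1] in each coordinate.
    term2 = (2.0 / m) * sum(math.prod((1.0 - x * x) / 2.0 for x in pt) for pt in points)
    # Double sum: volume of the box where both indicator functions equal one.
    term3 = sum(
        math.prod(1.0 - max(a, b) for a, b in zip(p, q))
        for p in points
        for q in points
    ) / (m * m)
    return math.sqrt(max(term1 - term2 + term3, 0.0))

one_point = l2_star_discrepancy([[0.5]])            # squared value is 1/12
two_points = l2_star_discrepancy([[0.25], [0.75]])  # squared value is 1/48
```

The cost is quadratic in $m$ and linear in $d$, which makes the $L_2$ case a convenient check for any point set produced by a constructive method.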


The concept of discrepancy is a fundamental concept in numerical integration.
There are many books and survey papers on discrepancy and related topics. We
will mention some of them as a reference for the history of the subject: [KN], [BC],
[Ma], [C], [NW], [T2]. For $1 < p < \infty$ the following relation is known (see [BC, p. 5])

(1.13)    $D(m, d)_p \asymp m^{-1} (\ln m)^{(d-1)/2}$

with constants in $\asymp$ depending on $p$ and $d$. The right order of $D(m, d)_\infty$ for $d \ge 3$
is unknown. Recently, driven by possible applications (see [NW]) in numerical
integration, the tendency to control the dependence of $D(m, d)_p$ on both variables $m$
and $d$ has appeared. Very interesting results in this direction have been obtained
in [HNWW]. They proved the estimate

(1.14)    $D(m, d)_\infty \le C d^{1/2} m^{-1/2}.$


It is pointed out in [HNWW] that (1.14) is only an existence theorem and even the
constant $C$ in (1.14) is unknown. Their proof is a probabilistic one. There are also
some other estimates in [HNWW] with explicit constants. We mention one of them

(1.15)    $D(m, d)_\infty \le C (d \ln d)^{1/2} ((\ln m)/m)^{1/2}$

with an explicit constant $C$. The proof of (1.15) is also probabilistic.

In Section 5 we provide a constructive algorithm which consists of maximizing
(approximately) certain functions of $d$ variables at each step. For a given $p \in [2, \infty)$
after $m$ steps of this algorithm we obtain a set $\xi = \{\xi^1, \dots, \xi^m\} \subset \Omega_d$ of points
with small $L_p$ discrepancy

$D(\xi, m, d)_p \le C p^{1/2} m^{-1/2}$

with an effective absolute constant $C$. The above algorithm is a greedy type algorithm
which is a slight modification of the corresponding procedure from [DGDS]. Here
we do not assume that a dictionary $\mathcal{D}$ is symmetric: $g \in \mathcal{D}$ implies $-g \in \mathcal{D}$. To
indicate this we will use the notation $\mathcal{D}^+$ for such a dictionary. We do not assume
that elements of a dictionary $\mathcal{D}^+$ are normalized ($\|g\| = 1$ if $g \in \mathcal{D}^+$); we only
assume that $\|g\| \le 1$ if $g \in \mathcal{D}^+$. By $A_1(\mathcal{D}^+)$ we denote the closure of the convex
hull of $\mathcal{D}^+$. Let $\epsilon = \{\epsilon_n\}_{n=1}^\infty$, $\epsilon_n > 0$, $n = 1, 2, \dots$.

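Incremental Algorithm with schedule $\epsilon$ (IA($\epsilon$)). Let $f \in A_1(\mathcal{D}^+)$. Denote
$f_0^{i,\epsilon} := f$ and $G_0^{i,\epsilon} := 0$. Then for each $m \ge 1$ we inductively define

1. $\varphi_m^{i,\epsilon} \in \mathcal{D}^+$ is any satisfying

$F_{f_{m-1}^{i,\epsilon}}(\varphi_m^{i,\epsilon} - f) \ge -\epsilon_m.$

2. Define

$G_m^{i,\epsilon} := (1 - 1/m) G_{m-1}^{i,\epsilon} + \varphi_m^{i,\epsilon}/m.$

3. Denote

$f_m^{i,\epsilon} := f - G_m^{i,\epsilon}.$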
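The incremental algorithm can be illustrated in a Euclidean setting, which is an assumption of this sketch and not the $L_p$ setting of Section 5. In $\mathbb{R}^n$ the norming functional of the residual is proportional to the inner product with it, and choosing the maximizer of $\langle f_{m-1}, g - f\rangle$ over $\mathcal{D}^+$ satisfies step 1 for every $\epsilon_m > 0$, since the maximum is nonnegative when $f$ is a convex combination of dictionary elements. For this choice a direct induction on $m^2\|f_m\|^2$ gives $\|f_m\| \le 2 m^{-1/2}$. The function name `incremental_algorithm` and the example dictionary are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def incremental_algorithm(f, dictionary, steps):
    """IA steps in Euclidean R^n with exact maximisation in step 1.

    Returns the errors ||f - G_m|| for m = 1, ..., steps.
    """
    G = [0.0] * len(f)
    errors = []
    for m in range(1, steps + 1):
        r = [fi - gi for fi, gi in zip(f, G)]
        # Step 1: maximise <r, g - f>, which has the sign of F_{f_{m-1}}(g - f).
        phi = max(dictionary, key=lambda g: dot(r, [gi - fi for gi, fi in zip(g, f)]))
        # Step 2: equal-weight update, so that G_m = (phi_1 + ... + phi_m) / m.
        G = [(1.0 - 1.0 / m) * Gi + pi / m for Gi, pi in zip(G, phi)]
        # Step 3: new residual norm.
        r = [fi - gi for fi, gi in zip(f, G)]
        errors.append(math.sqrt(dot(r, r)))
    return errors

# A non-symmetric dictionary of vectors with norm at most one, and f its average.
example_dictionary = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-0.5, 0.5]]
example_f = [sum(g[i] for g in example_dictionary) / 4.0 for i in range(2)]
example_errors = incremental_algorithm(example_f, example_dictionary, 200)
```

The approximant $G_m$ is an average of $m$ dictionary elements with equal weights $1/m$, which is exactly the form required of a cubature formula or a point set in the discrepancy problem.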

Let us make a brief comparison of the above three types of greedy algorithms.
The AWCGA contains a step of finding an approximant $G_m \in \Phi_m$ that provides
approximation close to the best approximation. The corresponding steps of the
AWRGA and IA($\epsilon$) are simpler: optimization over $\lambda \in [0, 1]$ in the AWRGA and a
simple convex combination in the IA($\epsilon$). Next, the AWCGA can be applied to any
$f \in X$. The AWRGA can be applied only to $f \in A_1(\mathcal{D})$ (in other words, to $f$ such
that $\|f\|_{A_1(\mathcal{D})} \le 1$). The IA($\epsilon$) can be applied only to $f \in A_1(\mathcal{D}^+)$ ($\|f\|_{A_1(\mathcal{D}^+)} = 1$).
In some cases (as in Section 5) the problem itself implies $\|f\|_{A_1(\mathcal{D}^+)} = 1$. However,
if the condition $\|f\|_{A_1(\mathcal{D}^+)} = 1$ (or $\|f\|_{A_1(\mathcal{D})} \le 1$) is not satisfied automatically,
then it could be a difficult problem to find $\|f\|_{A_1(\mathcal{D}^+)}$ and even to estimate $\|f\|_{A_1(\mathcal{D})}$.
In such a case we would recommend using the AWCGA. Clearly, the AWCGA is
the only option if $\|f\|_{A_1(\mathcal{D})} = \infty$.

2.  Convergence  and  rate  of  approximation  of  AWCGA 

We begin this section with a known theorem on convergence of the WCGA [T1]. In
the formulation of this theorem we need a special sequence which is defined for a
given modulus of smoothness $\rho(u)$ and a given $\tau = \{t_k\}_{k=1}^\infty$.

Definition 2.1. Let $\rho(u)$ be an even convex function on $(-\infty, \infty)$ with the
property: $\rho(2) \ge 1$ and

$\lim_{u \to 0} \rho(u)/u = 0.$

For any $\tau = \{t_k\}_{k=1}^\infty$, $0 < t_k \le 1$, and $0 < \theta \le 1/2$ we define $\xi_m := \xi_m(\rho, \tau, \theta)$ as a
number $u$ satisfying the equation

(2.1)    $\rho(u) = \theta t_m u.$

Remark 2.1. Assumptions on $\rho(u)$ imply that the function

$e(u) := \rho(u)/u, \quad u \ne 0, \qquad e(0) = 0,$

is a continuous increasing function on $[0, \infty)$ with $e(2) \ge 1/2$. Thus (2.1) has a
unique solution $\xi_m$ with $0 < \xi_m \le 2$.
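Since $e(u) = \rho(u)/u$ is continuous and increasing with $e(2) \ge 1/2 \ge \theta t_m$, the solution $\xi_m$ of (2.1) can be located by bisection on $(0, 2]$. The following minimal sketch does this for a user-supplied $\rho$; the function name `xi` and the tolerance are hypothetical. For the power type case $\rho(u) = \gamma u^q$ the equation can also be solved by hand, $\xi_m = (\theta t_m/\gamma)^{1/(q-1)}$, so that $t_m \xi_m$ is a constant multiple of $t_m^{q/(q-1)}$; the sketch compares the two.

```python
def xi(rho, theta, t, tol=1e-12):
    """Solve rho(u) = theta * t * u for u in (0, 2] by bisection on rho(u)/u."""
    lo, hi = 0.0, 2.0
    target = theta * t
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)  # mid > 0 because hi > lo >= 0
        if rho(mid) / mid < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Power type moduli rho(u) = gamma * u**q with rho(2) >= 1, as Definition 2.1 requires.
xi_quadratic = xi(lambda u: 0.5 * u ** 2, theta=0.5, t=1.0)      # closed form: 1.0
xi_power = xi(lambda u: 0.5 * u ** 1.5, theta=0.25, t=0.8)       # closed form: 0.16
```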

The following theorem and corollary have been proved in [T1].

Theorem 2.1. Let $X$ be a uniformly smooth Banach space with the modulus of
smoothness $\rho(u)$. Assume that a sequence $\tau := \{t_k\}_{k=1}^\infty$ satisfies the condition: for
any $\theta > 0$ we have

(2.2)    $\sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) = \infty.$

Then for any $f \in X$ we have

$\lim_{m \to \infty} \|f_m^{c,\tau}\| = 0.$

Corollary 2.1. Let a Banach space $X$ have modulus of smoothness $\rho(u)$ of power
type $1 < q \le 2$ ($\rho(u) \le \gamma u^q$). Assume that

(2.3)    $\sum_{m=1}^\infty t_m^p = \infty, \qquad p = \frac{q}{q-1}.$

Then the WCGA converges for any $f \in X$.

We will prove the following theorem on convergence of the AWCGA.

Theorem 2.2. Let $X$ be a uniformly smooth Banach space with the modulus of
smoothness $\rho(u)$. Assume that sequences $\tau$, $\delta$, $\eta$ satisfy the conditions: for any
$\theta > 0$ we have

(2.4)    $\sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) = \infty$

and

(2.5)    $\delta_m = o(t_m \xi_m(\rho, \tau, \theta))$ and $\eta_m = o(t_m \xi_m(\rho, \tau, \theta)).$

Then for any $f \in X$ we have

$\lim_{m \to \infty} \|f_m^{\tau,\delta,\eta}\| = 0.$

Corollary 2.2. Let a Banach space $X$ have modulus of smoothness $\rho(u)$ of power
type $1 < q \le 2$ ($\rho(u) \le \gamma u^q$). Assume that

$\sum_{m=1}^\infty t_m^p = \infty, \qquad p = \frac{q}{q-1},$

and

$\delta_m = o(t_m^p)$ and $\eta_m = o(t_m^p).$

Then the AWCGA converges for any $f \in X$.

Corollary 2.3. Let $X$ be a uniformly smooth Banach space. Assume that $\tau = \{t\}$,
$t \in (0, 1]$. Then for any two sequences $\delta, \eta \in c_0$ the corresponding AWCGA
converges for any $f \in X$.

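Lemma 2.1. Let $X$ be a uniformly smooth Banach space with the modulus of
smoothness $\rho(u)$. For a finite-dimensional subspace $L$ of $X$ and an element $f \in X$
denote

$E_L(f) := \inf_{l \in L} \|f - l\|.$

Assume that an element $g \in L$ and a functional $F$ satisfy the following conditions

(2.6)    $\|f^L\| \le E_L(f)(1 + a), \qquad f^L := f - g, \qquad a \in [0, 1];$

(2.7)    $F(f^L) \ge \|f^L\|(1 - b), \qquad \|F\| \le 1, \qquad b \in [0, 1].$

Then

$|F(g)| \le \inf_{v > 0} (a + b + 2\rho(3v\|f\|))/v.$

Proof. For any $\lambda$ we have from the definition of $\rho(u)$ that

(2.8)    $\|f^L - \lambda g\| + \|f^L + \lambda g\| \le 2\|f^L\| \Big(1 + \rho\Big(\frac{\lambda \|g\|}{\|f^L\|}\Big)\Big).$

Next, assume $|F(g)| = \beta > 0$. Then either $F(g) = \beta$ or $F(-g) = \beta$. We will carry
out the proof under the assumption $F(g) = \beta$ and note that the case $F(-g) = \beta$ is
similar. We have

(2.9)    $\|f^L + \lambda g\| \ge F(f^L + \lambda g) \ge \|f^L\|(1 - b) + \lambda \beta$

and by (2.8)

(2.10)    $\|f^L - \lambda g\| \le \|f^L\| \Big(1 + b + 2\rho\Big(\frac{\lambda \|g\|}{\|f^L\|}\Big)\Big) - \lambda \beta.$

On the other hand for any $\lambda$

$\|f^L - \lambda g\| \ge E_L(f) \ge \|f^L\|(1 + a)^{-1} \ge \|f^L\|(1 - a).$

Therefore, using $\|g\| \le \|f\| + \|f^L\| \le \|f\| + (1 + a)E_L(f) \le 3\|f\|$, we get for any $\lambda$

$\frac{\lambda \beta}{\|f^L\|} \le a + b + 2\rho\Big(\frac{3\lambda \|f\|}{\|f^L\|}\Big).$

Setting $v = \lambda/\|f^L\|$ proves the lemma.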

We will need the following simple lemma (see [T1]).

Lemma 2.2. For any bounded linear functional $F$ and any dictionary $\mathcal{D}$ we have

$\sup_{g \in \mathcal{D}} F(g) = \sup_{f \in A_1(\mathcal{D})} F(f).$


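Lemma 2.3. Let $X$ be a uniformly smooth Banach space with modulus of
smoothness $\rho(u)$. Take a number $\epsilon \ge 0$ and two elements $f$, $f^\epsilon$ from $X$ such that

$\|f - f^\epsilon\| \le \epsilon$

and

$f^\epsilon/A(\epsilon) \in A_1(\mathcal{D}),$

with some number $A(\epsilon)$. Then for the AWCGA with $\tau$, $\delta$, $\eta$ we have for $m = 1, 2, \dots$

$E_m(f) \le \|f_{m-1}\| \inf_{\lambda} \Big(1 + \delta_{m-1} - \lambda t_m A(\epsilon)^{-1}\Big(1 - \delta_{m-1} - \frac{\beta_{m-1} + \epsilon}{\|f_{m-1}\|}\Big) + 2\rho\Big(\frac{\lambda}{\|f_{m-1}\|}\Big)\Big),$

where

$\beta_{m-1} := \inf_{v > 0} (\delta_{m-1} + \eta_{m-1} + 2\rho(3v\|f\|))/v.$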


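Proof. We have for any $\lambda$

(2.11)    $\|f_{m-1} - \lambda \varphi_m\| + \|f_{m-1} + \lambda \varphi_m\| \le 2\|f_{m-1}\| \Big(1 + \rho\Big(\frac{\lambda}{\|f_{m-1}\|}\Big)\Big)$

and by 1) from the definition of the AWCGA and Lemma 2.2 we get

$F_{m-1}(\varphi_m) \ge t_m \sup_{g \in \mathcal{D}} F_{m-1}(g) = t_m \sup_{\phi \in A_1(\mathcal{D})} F_{m-1}(\phi) \ge t_m A(\epsilon)^{-1} F_{m-1}(f^\epsilon).$

By Lemma 2.1 we obtain

$F_{m-1}(f^\epsilon) = F_{m-1}(f + f^\epsilon - f) \ge F_{m-1}(f) - \epsilon = F_{m-1}(f_{m-1} + G_{m-1}) - \epsilon$
$\ge F_{m-1}(f_{m-1}) - |F_{m-1}(G_{m-1})| - \epsilon \ge \|f_{m-1}\|(1 - \delta_{m-1}) - \beta_{m-1} - \epsilon.$

Thus similarly to (2.9) and (2.10) we get from (2.11)

(2.12)    $E_m(f) \le \inf_{\lambda} \|f_{m-1} - \lambda \varphi_m\| \le \|f_{m-1}\| \inf_{\lambda} \Big(1 + \delta_{m-1} - \lambda t_m A(\epsilon)^{-1}\Big(1 - \delta_{m-1} - \frac{\beta_{m-1} + \epsilon}{\|f_{m-1}\|}\Big) + 2\rho\Big(\frac{\lambda}{\|f_{m-1}\|}\Big)\Big),$

which proves the lemma.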

Proof of Theorem 2.2. The definition of $\{E_m(f)\}$ implies that it is a nonincreasing
sequence. Therefore we have

$\lim_{m \to \infty} E_m(f) = \alpha.$

We prove that $\alpha = 0$ by contradiction. Assume the contrary, that $\alpha > 0$. Then for
any $m$ we have

$\|f_m\| \ge E_m(f) \ge \alpha.$

We set $\epsilon = \alpha/4$ and find $f^\epsilon$ such that

$\|f - f^\epsilon\| \le \epsilon$ and $f^\epsilon/A(\epsilon) \in A_1(\mathcal{D})$

with some $A(\epsilon)$. It is clear that $\lim_{m \to \infty} \beta_m = 0$. We choose $M$ such that for all
$m \ge M$ we have

$\delta_{m-1} + (\beta_{m-1} + \epsilon)/\alpha \le 1/2.$

Then by Lemma 2.3 we get

$E_m(f) \le \|f_{m-1}\| \inf_{\lambda} (1 + \delta_{m-1} - \lambda t_m A(\epsilon)^{-1}/2 + 2\rho(\lambda/\alpha)).$

Let us specify $\theta := \frac{\alpha}{8A(\epsilon)}$ and take $\lambda = \alpha \xi_m(\rho, \tau, \theta)$. Then we obtain

$E_m(f) \le \|f_{m-1}\|(1 + \delta_{m-1} - 2\theta t_m \xi_m)$

and

$\|f_m\| \le \|f_{m-1}\|(1 + \delta_{m-1} - 2\theta t_m \xi_m)(1 + \eta_m).$

Using the assumption (2.5) we get for big enough $m$ that

$(1 + \delta_{m-1} - 2\theta t_m \xi_m)(1 + \eta_m) \le 1 - \theta t_m \xi_m.$

The assumption

$\sum_{m=1}^\infty t_m \xi_m = \infty$

implies that

$\|f_m\| \to 0$ as $m \to \infty.$

We got a contradiction which proves the theorem.

We now proceed to the rate of convergence of the AWCGA. The following
theorem has been proved in [T1] for the WCGA.

Theorem 2.3. Let $X$ be a uniformly smooth Banach space with the modulus of
smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then for a sequence $\tau := \{t_k\}_{k=1}^\infty$, $t_k \le 1$,
$k = 1, 2, \dots$, we have for any $f \in A_1(\mathcal{D})$ that

$\|f_m^{c,\tau}\| \le C(q, \gamma) \Big(1 + \sum_{k=1}^m t_k^p\Big)^{-1/p}, \qquad p := \frac{q}{q-1},$

with a constant $C(q, \gamma)$ which may depend only on $q$ and $\gamma$.

Remark 2.2. It follows from the proof of Theorem 2.3 in [T1] that

$C(q, \gamma) = (2(4\gamma)^{\frac{1}{q-1}})^{1/p} \le C\gamma^{1/q}$

with absolute constant $C$.

We prove here the same rate of convergence for an adaptive AWCGA, where
adaptive means that the sequences $\delta$ and $\eta$ are determined by the AWCGA applied to a
given element $f \in A_1(\mathcal{D})$.

Theorem 2.4. Let $X$ be a uniformly smooth Banach space with the modulus of
smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Let a weakness sequence $\tau := \{t_k\}_{k=1}^\infty$, $t_k \le 1$,
$k = 1, 2, \dots$, be such that

$\sum_{k=1}^\infty t_k^p = \infty, \qquad p = \frac{q}{q-1}.$

For a given $f \in A_1(\mathcal{D})$ apply the AWCGA with

$\delta_{m-1} := t_m^p \|f_{m-1}\|^p 3^{-p} (16A_q)^{-1}, \qquad m = 1, 2, \dots;$

$\eta_{m-1} := t_m^p E_{m-1}(f)^p 3^{-p} (16A_q)^{-1}, \qquad m = 2, \dots,$

where

$A_q := 4(8\gamma)^{\frac{1}{q-1}}.$

Then we have

$\|f_m^{\tau,\delta,\eta}\| \le C\gamma^{1/q} \Big(1 + \sum_{k=1}^m t_k^p\Big)^{-1/p}, \qquad p := \frac{q}{q-1},$

with absolute constant $C$.

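Proof. By Lemma 2.3 with $\epsilon = 0$ and $A(\epsilon) = 1$ we have for $f \in A_1(\mathcal{D})$ that

(2.13)    $E_m(f) \le \|f_{m-1}\| \inf_{\lambda} \Big(1 + \delta_{m-1} - \lambda t_m (1 - \delta_{m-1} - \beta_{m-1}/\|f_{m-1}\|) + 2\gamma\Big(\frac{\lambda}{\|f_{m-1}\|}\Big)^q\Big).$

We estimate $\beta_{m-1}$ by choosing

$v = t_m^p \|f_{m-1}\|^{p-1} 3^{-p}/A_q.$

We have

$\beta_{m-1} \le (\delta_{m-1} + \eta_{m-1})/v + 2\gamma 3^q v^{q-1} \le (1/16 + 1/16 + 1/4)\|f_{m-1}\| = \frac{3}{8}\|f_{m-1}\|.$

Using $\delta_{m-1} \le 1/16$ we get from (2.13)

(2.14)    $E_m(f) \le \|f_{m-1}\| \inf_{\lambda} \Big(1 + \delta_{m-1} - \frac{1}{2}\lambda t_m + 2\gamma\Big(\frac{\lambda}{\|f_{m-1}\|}\Big)^q\Big).$

We choose $\lambda$ from the equation

$\frac{1}{4}\lambda t_m = 2\gamma\Big(\frac{\lambda}{\|f_{m-1}\|}\Big)^q,$

which implies that

$\lambda = \|f_{m-1}\|^{\frac{q}{q-1}} (8\gamma)^{-\frac{1}{q-1}} t_m^{\frac{1}{q-1}} = 4 t_m^{\frac{1}{q-1}} \|f_{m-1}\|^p/A_q.$

With this $\lambda$, using the notation $p := \frac{q}{q-1}$, we get from (2.14)

$E_m(f) \le \|f_{m-1}\|(1 + \delta_{m-1} - \tfrac{1}{4}\lambda t_m) \le \|f_{m-1}\|(1 - t_m^p \|f_{m-1}\|^p/A_q)$
$\le E_{m-1}(f)(1 + t_m^p E_{m-1}(f)^p/(2A_q))(1 - t_m^p \|f_{m-1}\|^p/A_q)$
$\le E_{m-1}(f)(1 - t_m^p E_{m-1}(f)^p/(2A_q)).$

Raising both sides of this inequality to the power $p$ and taking into account the
inequality $x^r \le x$ for $r \ge 1$, $0 \le x \le 1$, we obtain

$E_m(f)^p \le E_{m-1}(f)^p (1 - t_m^p E_{m-1}(f)^p/(2A_q)).$

By Lemma 3.1 from [T3], using the estimate $\|f\|^p \le 1 \le A_q$, we get

$E_m(f)^p \le 2A_q \Big(1 + \sum_{n=1}^m t_n^p\Big)^{-1},$

which implies

$\|f_m\| \le C\gamma^{1/q} \Big(1 + \sum_{n=1}^m t_n^p\Big)^{-1/p}.$

Theorem 2.4 is now proved.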

We discussed above the performance of the AWCGA. The AWCGA is defined in terms
of controlling relative errors of approximation of the norming functional and the best
approximant (see the definition of the AWCGA). We now discuss a modification of the
AWCGA with control of absolute errors of approximation. Let three sequences
$\tau = \{t_k\}_{k=1}^\infty$, $\epsilon = \{\epsilon_k\}_{k=0}^\infty$, $\alpha = \{\alpha_k\}_{k=1}^\infty$ of numbers from $[0, 1]$ be given.

Approximate Weak Chebyshev Greedy Algorithm (a) (AWCGA(a)). We
define $f_0 := f_0^{\tau,\epsilon,\alpha} := f$. Then for each $m \ge 1$ we inductively define

1). $F_{m-1}$ is a functional with properties

$\|F_{m-1}\| \le 1, \qquad F_{m-1}(f_{m-1}) \ge \|f_{m-1}\| - \epsilon_{m-1};$

and $\varphi_m := \varphi_m^{\tau,\epsilon,\alpha} \in \mathcal{D}$ is any satisfying

$F_{m-1}(\varphi_m) \ge t_m \sup_{g \in \mathcal{D}} F_{m-1}(g).$

2). Define

$\Phi_m := \operatorname{span}\{\varphi_j\}_{j=1}^m,$

and denote

$E_m(f) := \inf_{\varphi \in \Phi_m} \|f - \varphi\|.$

Let $G_m \in \Phi_m$ be such that

$\|f - G_m\| \le E_m(f) + \alpha_m.$

3). Denote

$f_m := f_m^{\tau,\epsilon,\alpha} := f - G_m.$

The following analog of Theorem 2.2 holds for the AWCGA(a).

Theorem 2.5. Let $X$ be a uniformly smooth Banach space with the modulus of
smoothness $\rho(u)$. Assume that sequences $\tau$, $\epsilon$, $\alpha$ satisfy the conditions: for any
$\theta > 0$ we have

$\sum_{m=1}^\infty t_m \xi_m(\rho, \tau, \theta) = \infty$

and

$\sum_{m=1}^\infty \epsilon_m < \infty$ and $\sum_{m=1}^\infty \alpha_m < \infty.$

Then for any $f \in X$ we have

$\lim_{m \to \infty} \|f_m^{\tau,\epsilon,\alpha}\| = 0.$

The proof of this theorem is similar to the proof of Theorem 2.2. We will not
present this proof here and remark that the only new ingredient of the proof of
Theorem 2.5 is the following simple lemma.


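Lemma 2.4. Let

$$\sum_{m=1}^\infty \gamma_m = \infty, \qquad \sum_{m=1}^\infty a_m < \infty, \qquad \gamma_m \in [0,1].$$

Assume that a nonnegative sequence $\{x_m\}$ satisfies the relation

$$x_m \le x_{m-1}(1 - \gamma_m) + a_m, \qquad m = 1,2,\dots.$$

Then

$$\lim_{m\to\infty} x_m = 0.$$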
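The behaviour described in Lemma 2.4 can be observed numerically. The following is a minimal sketch, not part of the original argument: it runs the recurrence with equality (the extremal case) for the illustrative choices $\gamma_m = 1/m$ and $a_m = 1/m^2$, which are assumptions made here only for the example. For these choices one has $m x_m = (m-1)x_{m-1} + 1/m$, so $x_m$ equals the $m$-th harmonic number divided by $m$.

```python
def iterate_recurrence(gammas, errors, x0=1.0):
    """Run x_m = x_{m-1} * (1 - gamma_m) + a_m and return [x_0, x_1, ...]."""
    xs = [x0]
    for g, a in zip(gammas, errors):
        xs.append(xs[-1] * (1.0 - g) + a)
    return xs

M = 10000
gammas = [1.0 / m for m in range(1, M + 1)]       # the sum of gamma_m diverges
errors = [1.0 / m ** 2 for m in range(1, M + 1)]  # the sum of a_m converges
xs = iterate_recurrence(gammas, errors)
harmonic = sum(1.0 / k for k in range(1, M + 1))  # closed form: x_M = H_M / M
```

The decay is slow (of order $(\ln m)/m$ for this choice), which is consistent with the lemma: it asserts convergence to zero but no rate.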
3.  Convergence  and  rate  of  approximation  of  AWRGA

The following two theorems on WRGA have been proved in [T1].

Theorem 3.1. Let $X$ be a uniformly smooth Banach space with the modulus of smoothness $\rho(u)$. Assume that a sequence $\tau := \{t_k\}_{k=1}^\infty$ satisfies the condition: for any $\theta > 0$ we have

$$\sum_{m=1}^\infty t_m\xi_m(\rho,\tau,\theta) = \infty.$$

Then for any $f \in A_1(\mathcal{D})$ we have

$$\lim_{m\to\infty} \|f_m^{r,\tau}\| = 0.$$


Theorem 3.2. Let $X$ be a uniformly smooth Banach space with the modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Then for a sequence $\tau := \{t_k\}_{k=1}^\infty$, $t_k \le 1$, $k = 1,2,\dots$, we have for any $f \in A_1(\mathcal{D})$ that

$$\|f_m^{r,\tau}\| \le C_1(q,\gamma)\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}, \qquad p := \frac{q}{q-1},$$

with a constant $C_1(q,\gamma)$ which may depend only on $q$ and $\gamma$.

Remark 3.1. It follows from the proof of Theorem 3.2 in [T1] that

$$C_1(q,\gamma) \le C\gamma^{1/q}$$

with absolute constant $C$.

We  prove  here  analogs  of  Theorems  2.2  and  2.4  for  the  AWRGA. 

Theorem 3.3. Let $X$ be a uniformly smooth Banach space with the modulus of smoothness $\rho(u)$. Assume that sequences $\tau$, $\delta$, $\eta$ satisfy the conditions: for any $\theta > 0$ we have

$$\sum_{m=1}^\infty t_m\xi_m(\rho,\tau,\theta) = \infty \tag{3.1}$$

and

$$\delta_m = o(t_m\xi_m(\rho,\tau,\theta)) \quad\text{and}\quad \eta_m = o(t_m\xi_m(\rho,\tau,\theta)). \tag{3.2}$$

Then for any $f \in A_1(\mathcal{D})$ we have

$$\lim_{m\to\infty} \|f_m^{ar}\| = 0.$$

Corollary 3.1. In the particular case of $\tau = \{t\}$, $t > 0$, the assumption (3.1) is satisfied and the assumption (3.2) takes the form $\delta_m = o(1)$ and $\eta_m = o(1)$. Thus in the case $\tau = \{t\}$, $t > 0$, the AWRGA converges for each $f \in A_1(\mathcal{D})$ if $\delta, \eta \in c_0$.

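Lemma 3.1. Let $X$ be a uniformly smooth Banach space with modulus of smoothness $\rho(u)$. Then for any $f \in A_1(\mathcal{D})$ we have for $m = 1,2,\dots$

$$\|f_m^{ar}\| \le \|f_{m-1}^{ar}\| \inf_{0\le\lambda\le 1}\Bigl(1 + \delta_{m-1} - \lambda t_m(1-\delta_{m-1}) + 2\rho\Bigl(\frac{2\lambda}{\|f_{m-1}^{ar}\|}\Bigr)\Bigr)(1+\eta_m).$$

Proof. We have

$$f_m^{ar} := f - \bigl((1-\lambda_m)G_{m-1}^{ar} + \lambda_m\varphi_m^{ar}\bigr) = f_{m-1}^{ar} - \lambda_m(\varphi_m^{ar} - G_{m-1}^{ar})$$

and

$$\|f_m^{ar}\| \le \inf_{0\le\lambda\le 1} \|f_{m-1}^{ar} - \lambda(\varphi_m^{ar} - G_{m-1}^{ar})\|(1+\eta_m).$$

Similarly to (2.11) we have for any $\lambda$

$$\|f_{m-1}^{ar} - \lambda(\varphi_m^{ar} - G_{m-1}^{ar})\| + \|f_{m-1}^{ar} + \lambda(\varphi_m^{ar} - G_{m-1}^{ar})\| \le 2\|f_{m-1}^{ar}\|\Bigl(1 + \rho\Bigl(\frac{\lambda\|\varphi_m^{ar} - G_{m-1}^{ar}\|}{\|f_{m-1}^{ar}\|}\Bigr)\Bigr). \tag{3.3}$$

Next we get for $\lambda \ge 0$

$$\|f_{m-1}^{ar} + \lambda(\varphi_m^{ar} - G_{m-1}^{ar})\| \ge F_{m-1}^{ar}\bigl(f_{m-1}^{ar} + \lambda(\varphi_m^{ar} - G_{m-1}^{ar})\bigr) \ge$$

$$\|f_{m-1}^{ar}\|(1-\delta_{m-1}) + \lambda F_{m-1}^{ar}(\varphi_m^{ar} - G_{m-1}^{ar}) \ge \|f_{m-1}^{ar}\|(1-\delta_{m-1}) + \lambda t_m \sup_{g\in\mathcal{D}} F_{m-1}^{ar}(g - G_{m-1}^{ar}).$$

Using Lemma 2.2 we continue

$$= \|f_{m-1}^{ar}\|(1-\delta_{m-1}) + \lambda t_m \sup_{\phi\in A_1(\mathcal{D})} F_{m-1}^{ar}(\phi - G_{m-1}^{ar}) \ge \|f_{m-1}^{ar}\|(1-\delta_{m-1}) + \lambda t_m\|f_{m-1}^{ar}\|(1-\delta_{m-1}).$$

Therefore, by the trivial estimate $\|\varphi_m^{ar} - G_{m-1}^{ar}\| \le 2$ we obtain from (3.3)

$$\|f_{m-1}^{ar} - \lambda(\varphi_m^{ar} - G_{m-1}^{ar})\| \le \|f_{m-1}^{ar}\|\Bigl(1 + \delta_{m-1} - \lambda t_m(1-\delta_{m-1}) + 2\rho\Bigl(\frac{2\lambda}{\|f_{m-1}^{ar}\|}\Bigr)\Bigr), \tag{3.4}$$

which proves Lemma 3.1.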
Proof of Theorem 3.3. By the definition the sequence $\{\|f_m^{ar}\|\}$ is nonincreasing. Let

$$\lim_{m\to\infty} \|f_m^{ar}\| = \alpha.$$

Similarly to the proof of Theorem 2.2 we will use the contradiction argument. Assuming $\alpha > 0$ we get from Lemma 3.1 for big enough $m$

$$\|f_m^{ar}\| \le \|f_{m-1}^{ar}\| \inf_{0\le\lambda\le 1}\Bigl(1 + \delta_{m-1} - \lambda t_m/2 + 2\rho\Bigl(\frac{2\lambda}{\|f_{m-1}^{ar}\|}\Bigr)\Bigr)(1+\eta_m). \tag{3.5}$$

Specifying $\theta = \alpha/16$ and taking $\lambda = \alpha\xi_m(\rho,\tau,\theta)/2$ we obtain from (3.5)

$$\|f_m^{ar}\| \le \|f_{m-1}^{ar}\|(1 + \delta_{m-1} - 2\theta t_m\xi_m)(1+\eta_m). \tag{3.6}$$

The remaining part of the proof repeats the arguments from the proof of Theorem 2.2.

Theorem 3.4. Let $X$ be a uniformly smooth Banach space with the modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Let a weakness sequence $\tau := \{t_k\}_{k=1}^\infty$, $t_k \le 1$, $k = 1,2,\dots$, be such that

$$\sum_{k=1}^\infty t_k^p = \infty, \qquad p = \frac{q}{q-1}.$$

For  a  given  f  e  Ai(V)  apply  the  AWRGA  with 

sm- 1  =  r,m  :=  CII/r-ill^B,)-1,  m  =  1, 2, ... ; 


where

$$B_q := 8(8\gamma)^{1/(q-1)}2^p.$$

Then we have

$$\|f_m^{ar}\| \le C\gamma^{1/q}\Bigl(1 + \sum_{k=1}^m t_k^p\Bigr)^{-1/p}, \qquad p := \frac{q}{q-1},$$

with absolute constant $C$.

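Proof. Using that $\delta_{m-1} \le 1/2$ we get by Lemma 3.1

$$\|f_m^{ar}\| \le \|f_{m-1}^{ar}\| \inf_{0\le\lambda\le 1}\Bigl(1 + \delta_{m-1} - \lambda t_m/2 + 2\gamma\Bigl(\frac{2\lambda}{\|f_{m-1}^{ar}\|}\Bigr)^q\Bigr)(1+\eta_m).$$

Choosing $\lambda$ from the equation

$$\lambda t_m/4 = 2\gamma\Bigl(\frac{2\lambda}{\|f_{m-1}^{ar}\|}\Bigr)^q$$

we find

$$\lambda = (8\gamma)^{-1/(q-1)}2^{-q/(q-1)}t_m^{1/(q-1)}\|f_{m-1}^{ar}\|^{q/(q-1)} \le 1$$

and

$$\|f_m^{ar}\| \le \|f_{m-1}^{ar}\|\bigl(1 + \delta_{m-1} - 2t_m^p\|f_{m-1}^{ar}\|^p/B_q\bigr)(1+\eta_m).$$

Using the definition of $\delta_{m-1}$, $\eta_m$, and $B_q$ we obtain

$$\|f_m^{ar}\| \le \|f_{m-1}^{ar}\|\bigl(1 - t_m^p\|f_{m-1}^{ar}\|^p/B_q\bigr)$$

and complete the proof in the same way as in the proof of Theorem 2.4.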

Let us compare Theorem 3.3 with Theorem 3.4 from [DGDS] (see Theorem 3.5 below). We recall some definitions from [DGDS]. An incremental sequence is any sequence $a_1, a_2, \dots$ of $X$ so that $a_1 \in \mathcal{D}$ and for each $n \ge 1$ there are some $g_n \in \mathcal{D}$ and $\lambda_n \in [0,1]$ so that

$$a_n = (1-\lambda_n)a_{n-1} + \lambda_n g_n, \qquad (a_0 = 0).$$

We say that an incremental sequence is $\epsilon$-greedy (with respect to $f$) if ($a_0 = 0$)

$$\|f - a_n\| < \inf_{\lambda\in[0,1];\, g\in\mathcal{D}} \|f - ((1-\lambda)a_{n-1} + \lambda g)\| + \epsilon_n, \qquad n = 1,2,\dots. \tag{3.7}$$

Theorem 3.5 ([DGDS]). Let $X$ be a uniformly smooth Banach space, and let $\epsilon = \{\epsilon_n\}_{n=1}^\infty$ be such that

$$\sum_{n=1}^\infty \epsilon_n < \infty. \tag{3.8}$$

Then any $\epsilon$-greedy (with respect to $f$) incremental sequence converges to $f$.

In order to find a sequence $\{a_n\}$ satisfying (3.7) one should solve a sequence of optimization problems:

$$\inf_{\lambda\in[0,1];\, g\in\mathcal{D}} \|f - ((1-\lambda)a_{n-1} + \lambda g)\| + \epsilon_n, \qquad n = 1,2,\dots \tag{3.9}$$

within accuracy $\epsilon_n$ satisfying (3.8). It is clear that the most difficult part of (3.9) is the optimization over $g \in \mathcal{D}$. The most important advantage of the WRGA and AWRGA is that they provide a way of obtaining a good element $\varphi_m^{r,\tau}$ (or $\varphi_m^{ar}$) from the dictionary by checking a much weaker condition than being optimal within accuracy $\epsilon_n$. In the AWRGA the way of obtaining $\varphi_m^{ar}$ consists of two steps: first, we find an approximation $F_{m-1}^{ar}$ of the norming (peak) functional of the residual $f_{m-1}^{ar}$ with high accuracy ($F_{m-1}^{ar}(f_{m-1}^{ar}) \ge \|f_{m-1}^{ar}\|(1-\delta_{m-1})$); second, we look for $\varphi_m^{ar}$ satisfying a very weak (compared to being optimal) condition

$$F_{m-1}^{ar}(\varphi_m^{ar} - G_{m-1}^{ar}) \ge t_m \sup_{g\in\mathcal{D}} F_{m-1}^{ar}(g - G_{m-1}^{ar}).$$

The other place in the AWRGA where we need high accuracy is the optimization over $\lambda \in [0,1]$. Clearly, the above two tasks with high accuracy are much easier than the above selection of a dictionary element $\varphi_m^{ar}$.
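The two-stage structure described above (norming functional, weak selection, then a one-dimensional search over $\lambda$) can be sketched in a finite-dimensional setting. The following is an illustration under stated assumptions, not the paper's implementation: the space is $\ell_p^n$ with $p \ge 2$, the dictionary is the symmetric set $\{\pm e_i\}$, the norming functional is computed exactly (so $\delta_m = 0$), and the search over $\lambda$ is a ternary search, which plays the role of an approximate minimization. The function names, the test vector, and the parameter values are hypothetical.

```python
def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def norming_functional(r, p):
    """Coefficients c with F(g) = sum(c_i g_i), F(r) = ||r||_p and ||F|| = 1."""
    nr = lp_norm(r, p)
    return [nr ** (1.0 - p) * abs(x) ** (p - 2.0) * x for x in r]

def line_search(phi, iters=60):
    """Ternary search for the minimizer of a convex function on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if phi(a) <= phi(b):
            hi = b
        else:
            lo = a
    lam = 0.5 * (lo + hi)
    return lam if phi(lam) <= phi(0.0) else 0.0  # never worse than lambda = 0

def relaxed_greedy(f, dictionary, p, t, steps):
    G = [0.0] * len(f)
    norms = []
    for _ in range(steps):
        r = [fi - gi for fi, gi in zip(f, G)]
        if lp_norm(r, p) == 0.0:
            break
        F = norming_functional(r, p)
        FG = sum(c * g for c, g in zip(F, G))
        gains = [sum(c * x for c, x in zip(F, g)) - FG for g in dictionary]
        threshold = t * max(gains)  # weak condition: F(phi - G) >= t * sup F(g - G)
        phi_m = next(g for g, v in zip(dictionary, gains) if v >= threshold)

        def residual_norm(lam):
            return lp_norm([fi - ((1.0 - lam) * gi + lam * x)
                            for fi, gi, x in zip(f, G, phi_m)], p)

        lam = line_search(residual_norm)
        G = [(1.0 - lam) * gi + lam * x for gi, x in zip(G, phi_m)]
        norms.append(lp_norm([fi - gi for fi, gi in zip(f, G)], p))
    return G, norms

n, p = 5, 4.0
dictionary = []
for i in range(n):
    for sign in (1.0, -1.0):
        dictionary.append([sign if j == i else 0.0 for j in range(n)])
f = [0.4, -0.3, 0.2, 0.1, 0.0]  # sum of |coefficients| is 1, so f lies in A_1(D)
G, norms = relaxed_greedy(f, dictionary, p, t=0.5, steps=100)
```

The selection step accepts the first element whose gain reaches the fraction $t$ of the best gain; in this toy example the best gain is still computed by a full scan, whereas the point of the weak condition is that any certificate of this kind is acceptable.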
4.  Constructive  nonlinear  trigonometric  m-term  approximation

We describe the approximation method in detail in the univariate case. Consider the real $L_p(\mathbb{T})$ space with

$$\|f\|_p := \Bigl(\frac{1}{\pi}\int_{\mathbb{T}} |f(x)|^p\,dx\Bigr)^{1/p}, \quad 1 \le p < \infty; \qquad \|f\|_\infty := \max_x |f(x)|, \quad f \text{ continuous}.$$

Let $1 \le p < \infty$. Denote $\mathcal{T}_p$ the real trigonometric system normalized in $L_p$

$$2^{-1/p},\ c_p\sin x,\ c_p\cos x,\ \dots$$

where

$$c_p = \Bigl(\frac{1}{\pi}\int_{\mathbb{T}} |\sin x|^p\,dx\Bigr)^{-1/p}.$$

It is clear that $C_1 \le c_p \le C_2$ with two absolute constants $C_1$ and $C_2$. Let $T(N)$ denote the set of trigonometric polynomials of order $N$.

We discuss first a simpler construction based on the particular case of $p = 4$ in order to illustrate the idea of the construction. For a trigonometric polynomial

$$t(x) = a_0/2 + \sum_{k=1}^N (a_k\cos kx + b_k\sin kx)$$

denote

$$\|t\|_A := |a_0| + \sum_{k=1}^N (|a_k| + |b_k|).$$
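As a small illustration of the quantity $\|t\|_A$, the sketch below computes it from the coefficients and compares it with the maximum of $|t|$ sampled on a grid. The helper names and the sample coefficients are assumptions made for this example only. Since $|\cos kx|, |\sin kx| \le 1$, the sampled maximum can never exceed $\|t\|_A$.

```python
import math

def a_norm(a0, a, b):
    """|a_0| + sum_k (|a_k| + |b_k|) for t = a_0/2 + sum_k (a_k cos kx + b_k sin kx)."""
    return abs(a0) + sum(abs(x) for x in a) + sum(abs(x) for x in b)

def evaluate(a0, a, b, x):
    return a0 / 2.0 + sum(ak * math.cos(k * x) + bk * math.sin(k * x)
                          for k, (ak, bk) in enumerate(zip(a, b), start=1))

def sup_norm_on_grid(a0, a, b, points=2000):
    """Maximum of |t| over an equispaced grid (a lower bound for the true maximum)."""
    return max(abs(evaluate(a0, a, b, 2.0 * math.pi * j / points))
               for j in range(points))

a0, a, b = 1.0, [0.5, -0.25], [0.0, 0.125]
norm_A = a_norm(a0, a, b)
norm_sup = sup_norm_on_grid(a0, a, b)
```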

Then by Theorems 2.3 (or 2.4) and 3.2 (or 3.4) each of the algorithms WCGA (or AWCGA), WRGA (or AWRGA) with $\tau = \{1/2\}$, $q = 2$ provides a constructive way of approximation in the $L_4$-norm: for any $t \in T(N)$ we get an m-term trigonometric polynomial $G_m(t) \in T(N)$ such that

$$\|t - G_m(t)\|_4 \le C_1 m^{-1/2}\|t\|_A \tag{4.1}$$

with absolute constant $C_1$. By Nikol'skii's inequality this implies

$$\|t - G_m(t)\|_\infty \le C_2 N^{1/4} m^{-1/2}\|t\|_A \tag{4.2}$$

with absolute constant $C_2$.

We will build our constructive approximation operators $A^k(N,m)$ inductively from level $k = 1$ up to an arbitrary level $k$. We begin with the level $k = 1$. We set for $t \in T(N)$

$$A^1(N,m)(t) := G_m(t).$$

Then (4.2) implies for $m \le N$

$$\|t - A^1(N,m)(t)\|_\infty \le C_2 N^{1/4} m^{-1/2}\|t\|_A \le A_1 N^{1/4}(N/m)^{1/2} m^{-1/2}\|t\|_A. \tag{4.3}$$

We continue the construction inductively. Suppose we have built operators $A^k(N,m)$ such that for any $t \in T(N)$

$$\|t - A^k(N,m)(t)\|_\infty \le A_k N^{2^{-k-1}}(N/m)^{1/2} m^{-1/2}\|t\|_A. \tag{4.4}$$

We will build operators $A^{k+1}(N,m)$ and will control the constant $A_{k+1}$. We will carry out the construction for even numbers $m$.

Step 1. Let $t \in T(N)$. We approximate $t$ using (4.1)

$$\|t - G_{m/2}(t)\|_4 \le C_1 m^{-1/2}\|t\|_A.$$

Denote

$$h := (t - G_{m/2}(t))/\|t - G_{m/2}(t)\|_4.$$
Step 2. Take a positive number $D$ and decompose

$$h = h_D + h^D, \qquad h_D(x) := \begin{cases} h(x) & \text{if } |h(x)| \le D,\\ 0 & \text{otherwise.}\end{cases}$$

We need the following simple well known lemma.

Lemma 4.1. Assume $p \in [2,\infty)$ and $\|f\|_p = 1$. Then

$$\|f_D\|_\infty \le D \quad\text{and}\quad \|f^D\|_2 \le D^{1-p/2}.$$

By Lemma 4.1 with $p = 4$ we get

$$\|h_D\|_\infty \le D \quad\text{and}\quad \|h^D\|_2 \le D^{-1}.$$
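The level-$D$ decomposition and the two bounds of Lemma 4.1 are easy to check on sampled data. The sketch below is an illustration under assumptions: norms are replaced by their discrete analogs (mean over grid samples), and the test function, the grid size, and the value of $D$ are arbitrary choices made here. The discrete bound holds for the same reason as the continuous one: on the set where $|h| > D$ one has $|h|^2 \le |h|^p D^{2-p}$.

```python
import math

def mean_norm(values, p):
    """Discrete L_p norm with normalized measure: (mean |v|^p)^(1/p)."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)

def split_at_level(values, D):
    """Return (h_D, h^D): the part bounded by D and the remainder h - h_D."""
    small = [v if abs(v) <= D else 0.0 for v in values]
    large = [v - s for v, s in zip(values, small)]
    return small, large

p, D, n = 4.0, 1.2, 2048
xs = [2.0 * math.pi * j / n for j in range(n)]
raw = [math.cos(x) + 0.5 * math.cos(3 * x) + 2.0 * math.sin(7 * x) for x in xs]
scale = mean_norm(raw, p)
h = [v / scale for v in raw]          # normalized so that the discrete L_4 norm is 1
h_small, h_large = split_at_level(h, D)
```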

We would like to work with trigonometric polynomials instead of working with $h_D$ and $h^D$. Let $V_N$ be the de la Vallée Poussin operator. Consider $V_N(h_D)$ and $V_N(h^D)$. We have

$$h = V_N(h) = V_N(h_D) + V_N(h^D)$$

and

$$\|V_N(h_D)\|_\infty \le 3D; \qquad \|V_N(h^D)\|_2 \le D^{-1}; \qquad \|V_N(h^D)\|_A \le 2N^{1/2}D^{-1}.$$

Step 3. We approximate $V_N(h^D) \in T(2N)$ using operators from level $k$. By (4.4) we have

$$\|V_N(h^D) - A^k(2N,m/2)(V_N(h^D))\|_\infty \le A_k(2N)^{2^{-k-1}}\,3\,(N/m)^{1/2}m^{-1/2}\|V_N(h^D)\|_A.$$
For $t \in T(N)$ define

$$A^{k+1}(N,m,D)(t) := G_{m/2}(t) + \|t - G_{m/2}(t)\|_4\,A^k(2N,m/2)(V_N(h^D)).$$

Taking into account that $h \in T(N)$ we get

$$t - A^{k+1}(N,m,D)(t) = h\|t - G_{m/2}(t)\|_4 - \|t - G_{m/2}(t)\|_4\,A^k(2N,m/2)(V_N(h^D)) =$$

$$\|t - G_{m/2}(t)\|_4\bigl(h - A^k(2N,m/2)(V_N(h^D))\bigr) =$$

$$\|t - G_{m/2}(t)\|_4\bigl(V_N(h_D) + V_N(h^D) - A^k(2N,m/2)(V_N(h^D))\bigr).$$

Therefore,

$$\|t - A^{k+1}(N,m,D)(t)\|_\infty \le \|t - G_{m/2}(t)\|_4\bigl(3D + A_k(2N)^{2^{-k-1}}\,6\,(N/m)^{1/2}m^{-1/2}N^{1/2}D^{-1}\bigr) \le$$

$$\bigl(3D + A_k(2N)^{2^{-k-1}}\,6\,(N/m)D^{-1}\bigr)C_1 m^{-1/2}\|t\|_A. \tag{4.5}$$

Step 4. Choose

$$D = D(N,m,k) := \bigl(2A_k(2N)^{2^{-k-1}}(N/m)\bigr)^{1/2}.$$

By (4.5) we obtain

$$\|t - A^{k+1}(N,m,D)(t)\|_\infty \le A'_{k+1}N^{2^{-k-2}}(N/m)^{1/2}m^{-1/2}\|t\|_A \tag{4.6}$$

with

$$A'_{k+1} := 6C_1 2^{1/2}2^{2^{-k-2}}A_k^{1/2} \le C_3A_k^{1/2}. \tag{4.7}$$
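The choice of $D$ in Step 4 balances the two terms of a bound of the form $3D + 6X/D$, where $X$ abbreviates $A_k(2N)^{2^{-k-1}}(N/m)$. The following sketch (with hypothetical function names and sample values of $X$) checks numerically that $D = (2X)^{1/2}$ is the minimizer and that the minimal value is $6(2X)^{1/2}$, which is where the factor $6\cdot 2^{1/2}$ in (4.7) comes from.

```python
import math

def bound(D, X):
    """The bracket 3D + 6X/D with X standing for A_k (2N)^(2^(-k-1)) (N/m)."""
    return 3.0 * D + 6.0 * X / D

def balanced_level(X):
    """The level D = (2X)^(1/2) at which the two terms of the bracket coincide."""
    return math.sqrt(2.0 * X)

samples = [0.1, 1.0, 7.5]
minima = [(X, balanced_level(X), bound(balanced_level(X), X)) for X in samples]
```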

We recall that we have proved (4.6) with the constant $A'_{k+1}$ from (4.7) under the assumption that $m$ is an even number. We complete the construction by setting

$$A^{k+1}(N,m) := A^{k+1}(N, 2[m/2], D(N, 2[m/2], k)), \qquad m \ge 2.$$

Clearly (4.6) implies

$$\|t - A^{k+1}(N,m)(t)\|_\infty \le A_{k+1}N^{2^{-k-2}}(N/m)^{1/2}m^{-1/2}\|t\|_A \tag{4.8}$$

for all $m$ with $A_{k+1} = 2A'_{k+1}$. The relation (4.7) combined with $A_1 = C_2$ (see (4.3)) implies that $A_k \le C_4$ for all $k$.
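The boundedness of the constants follows from iterating a recursion of the form $A_{k+1} \le cA_k^{1/2}$ with $c = 2C_3$. A short sketch of the worst case (equality at every level) is given below; the numerical values of $A_1$ and $C_3$ are hypothetical, since the paper only asserts that absolute constants exist. If $A_k \le M := \max(A_1, c^2)$ then $cA_k^{1/2} \le cM^{1/2} \le M$, so the whole sequence stays below $M$.

```python
import math

def level_constants(a1, c3, levels):
    """Iterate A_{k+1} = 2 * C_3 * sqrt(A_k) starting from A_1 (worst case of (4.7))."""
    values = [a1]
    for _ in range(levels - 1):
        values.append(2.0 * c3 * math.sqrt(values[-1]))
    return values

a1, c3 = 50.0, 3.0            # hypothetical values, chosen only for illustration
constants = level_constants(a1, c3, levels=30)
ceiling = max(a1, (2.0 * c3) ** 2)
```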

Let $N$ be given. Choose $k$ satisfying $2^{k+1} \ge \ln N$. Then (4.4) gives for any $t \in T(N)$ the estimate

$$\|t - A^k(N,m)(t)\|_\infty \le C_5(N^{1/2}/m)\|t\|_A \tag{4.9}$$

for any $m$.
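The role of the condition $2^{k+1} \ge \ln N$ is that it makes the factor $N^{2^{-k-1}} = \exp(2^{-k-1}\ln N)$ in (4.4) at most $e$. The sketch below picks the smallest admissible level for a few sample values of $N$ (the helper name and the sample values are assumptions of this example) and evaluates that factor.

```python
import math

def level_for(N):
    """Smallest k >= 1 with 2^(k+1) >= ln N."""
    k = 1
    while 2 ** (k + 1) < math.log(N):
        k += 1
    return k

sizes = [2, 10, 10 ** 3, 10 ** 6, 10 ** 12]
factors = [(N, level_for(N), N ** (2.0 ** -(level_for(N) + 1))) for N in sizes]
```

The number of levels needed grows only like $\log_2 \ln N$, so the inductive construction is shallow even for very large $N$.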

We now proceed to a more elaborate construction that gives the following estimate.

Theorem 4.1. There exists a constructive algorithm $A(N,m)$ such that for any $t \in T(N)$ it provides an m-term trigonometric polynomial $A(N,m)(t)$ with the following approximation property

$$\|t - A(N,m)(t)\|_\infty \le Cm^{-1/2}(\ln(1 + N/m))^{1/2}\|t\|_A \tag{4.10}$$

with an absolute constant $C$.

Proof. We will construct an analog of the sequence of operators $\{A^k(N,m)\}$ constructed above. A new ingredient of this construction is the following. We will now approximate $t$ in the $L_p$-norm, $p \in [4,\infty)$, instead of the $L_4$-norm and will optimize over $p$.

Let $N$ and $m$ be given and let $t \in T(N)$. We use either the WCGA (AWCGA) or the WRGA (AWRGA) with $\tau = \{1/2\}$, $q = 2$, $\mathcal{D}_N = \mathcal{T}_p \cap T(N)$ to approximate $t$ by an m-term trigonometric polynomial in the $L_p$-norm, $p \in [4,\infty)$. By Theorems 2.3 (2.4) or 3.2 (3.4) with $X = T(N)_p$, where $T(N)_p$ is $T(N)$ equipped with the $L_p$-norm, we get

$$\|t - G_m^p(t)\|_p \le C_6C(2,\gamma)m^{-1/2}\|t\|_A. \tag{4.11}$$

Let us estimate the constant $C(2,\gamma)$. By (1.5) we obtain $\gamma = (p-1)/2$. Thus by Remark 2.2 or Remark 3.1 we get

$$C_6C(2,\gamma) \le C_7p^{1/2}. \tag{4.12}$$

We define the level $k = 1$ algorithms $A_p^1(N,m)$ by

$$A_p^1(N,m)(t) := G_m^p(t), \qquad t \in T(N). \tag{4.13}$$

We note that by construction $A_p^1(N,m)(t) \in T(N)$. By Nikol'skii's inequality we get from (4.11)–(4.13)

$$\|t - A_p^1(N,m)(t)\|_\infty \le C_8p^{1/2}N^{1/p}m^{-1/2}\|t\|_A \le C_8p^{1/2}N^{1/4}(N/m)^{\frac{1}{p-2}}m^{-1/2}\|t\|_A, \qquad m \le N. \tag{4.14}$$

We note here that taking $p_N := \ln N$ we get from the first inequality in (4.14)

$$\|t - A_{p_N}^1(N,m)(t)\|_\infty \le C(\ln N)^{1/2}m^{-1/2}\|t\|_A \tag{4.15}$$

with an absolute constant $C$. Thus the rest of the proof will be devoted to replacing $\ln N$ by $\ln(1 + N/m)$ in (4.15).

As in the case $p = 4$ we continue the construction by induction. Suppose we have built operators $A_p^k(N,m)$ such that for any $t \in T(N)$, $p \in [4,\infty)$

$$\|t - A_p^k(N,m)(t)\|_\infty \le A_k^pN^{2^{-k-1}}(N/m)^{\frac{1}{p-2}}m^{-1/2}\|t\|_A. \tag{4.16}$$
We will make steps similar to those from above.

Step 1. Let $t \in T(N)$ and let $m$ be an even number. We approximate $t$ using (4.11), (4.12)

$$\|t - G_{m/2}^p(t)\|_p \le C_9p^{1/2}m^{-1/2}\|t\|_A. \tag{4.17}$$

Denote

$$h[p] := (t - G_{m/2}^p(t))/\|t - G_{m/2}^p(t)\|_p.$$

Step 2. Take a positive number $D$ and decompose

$$h[p] = h_D[p] + h^D[p].$$

By Lemma 4.1 we get

$$\|h_D[p]\|_\infty \le D \quad\text{and}\quad \|h^D[p]\|_2 \le D^{1-p/2}$$

and, therefore,

$$\|V_N(h_D[p])\|_\infty \le 3D; \qquad \|V_N(h^D[p])\|_2 \le D^{1-p/2}; \qquad \|V_N(h^D[p])\|_A \le 2N^{1/2}D^{1-p/2}.$$

Step 3. We approximate $V_N(h^D[p]) \in T(2N)$ using operators from level $k$. By (4.16) we have

$$\|V_N(h^D[p]) - A_p^k(2N,m/2)(V_N(h^D[p]))\|_\infty \le A_k^p(2N)^{2^{-k-1}}(4N/m)^{\frac{1}{p-2}}2^{1/2}m^{-1/2}\|V_N(h^D[p])\|_A.$$

For $t \in T(N)$ define

$$A_p^{k+1}(N,m,D)(t) := G_{m/2}^p(t) + \|t - G_{m/2}^p(t)\|_p\,A_p^k(2N,m/2)(V_N(h^D[p])).$$

Similarly to the case $p = 4$ (see (4.5)) we get

$$\|t - A_p^{k+1}(N,m,D)(t)\|_\infty \le \|t - G_{m/2}^p(t)\|_p\bigl(3D + A_k^p(2N)^{2^{-k-1}}\,6\,(N/m)^{\frac{p}{2(p-2)}}D^{1-p/2}\bigr). \tag{4.18}$$

Step 4. Choose

$$D_p = D_p(N,m,k) := \bigl(2A_k^p(2N)^{2^{-k-1}}(N/m)^{\frac{p}{2(p-2)}}\bigr)^{2/p}.$$

By (4.17) we obtain from (4.18) for even $m$

$$\|t - A_p^{k+1}(N,m,D_p)(t)\|_\infty \le A'^{\,p}_{k+1}N^{2^{-k-2}}(N/m)^{\frac{1}{p-2}}m^{-1/2}\|t\|_A \tag{4.19}$$
with 


(4.20) 


Ar4,  <  A\  <  Cup1/2. 
We  note  that  (4.20)  implies

(4.21)  Ap  <C12p^^. 

We set

$$A_p^{k+1}(N,m) := A_p^{k+1}(N, 2[m/2], D_p(N, 2[m/2], k))$$

and obtain (4.16) with $k$ replaced by $k+1$ and a constant $A_{k+1}^p = 2A'^{\,p}_{k+1}$.

Let $N$ and $m$ be given. First we choose $k$ satisfying $2^{k+1} \ge \ln N$. Next we choose $p = 2 + \ln(1 + N/m)$. Then (4.16) and (4.21) give for any $t \in T(N)$ the estimate

$$\|t - A_p^k(N,m)(t)\|_\infty \le C_{13}m^{-1/2}(\ln(1 + N/m))^{1/2}\|t\|_A \tag{4.22}$$

for any $m$. This completes the proof of Theorem 4.1.

The same technique can also be used in the multivariate case. Let $L_p(\mathbb{T}^d)$ be the real Banach space with

$$\|f\|_p := \Bigl(\pi^{-d}\int_{\mathbb{T}^d} |f(x)|^p\,dx\Bigr)^{1/p}, \quad 1 \le p < \infty; \qquad \|f\|_\infty := \max_{x\in\mathbb{T}^d} |f(x)|, \quad f \text{ continuous}.$$

Denote $\mathcal{T}^d := \mathcal{T}\times\cdots\times\mathcal{T}$ ($d$ times) the real multivariate trigonometric system. Let $\mathbf{N} = (N_1,\dots,N_d)$. Denote $T(\mathbf{N})$ the space of trigonometric polynomials with degree $N_j$ in the variable $x_j$, $j = 1,\dots,d$. Let $\nu(\mathbf{N})$ be the dimension of $T(\mathbf{N})$. We formulate a generalization of Theorem 4.1 for the $d$-dimensional case and note that the proof repeats the proof of Theorem 4.1.

Theorem 4.2. There exists a constructive algorithm $A(\mathbf{N},m)$ such that for any $t \in T(\mathbf{N})$ it provides an m-term trigonometric polynomial $A(\mathbf{N},m)(t)$ with the following approximation property

$$\|t - A(\mathbf{N},m)(t)\|_\infty \le C(d)m^{-1/2}(\ln(1 + \nu(\mathbf{N})/m))^{1/2}\|t\|_A \tag{4.23}$$

with a constant $C(d)$ which may depend on $d$.

This theorem can be applied to studying m-term trigonometric approximation of function classes. We will consider here some examples. In the paper [DT1] the following two types of function classes were studied from the point of view of best m-term trigonometric approximation. We begin with the first class. For $0 < \alpha < \infty$ and $0 < q \le \infty$, let $\mathcal{F}_q^\alpha$ denote the class of those functions in $L_1(\mathbb{T}^d)$ such that

$$|f|_{\mathcal{F}_q^\alpha} := \Bigl(\sum_{k\in\mathbb{Z}^d}\bigl(\max(1,|k_1|,\dots,|k_d|)^\alpha|\hat f(k)|\bigr)^q\Bigr)^{1/q} \le 1.$$

The following theorem has been proved in [DT1].
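Theorem 4.3. If $\alpha > 0$ and $\lambda := \alpha/d + 1/q - 1/2$, then for all $1 \le p \le \infty$ and all $0 < q \le \infty$

$$C_1m^{-\lambda} \le \sigma_m(\mathcal{F}_q^\alpha, \mathcal{T}^d)_p \le C_2m^{-\lambda}, \qquad \alpha > d(1 - 1/q)_+,$$

with $C_1, C_2 > 0$ constants depending only on $d$, $\alpha$, $q$.

The second class is defined as follows. Let $\alpha > 0$, $0 < \tau, s \le \infty$ and let $B_s^\alpha(L_\tau)$ denote the class of functions such that there exist trigonometric polynomials $T_n$ of coordinate degree $2^n$ with the properties

$$f = \sum_{n=0}^\infty T_n, \qquad \bigl\|\{2^{n\alpha}\|T_n\|_\tau\}\bigr\|_{\ell_s(\mathbb{Z})} \le 1.$$

The following theorem has been proved in [DT1] for these classes.

Theorem 4.4. Let $1 \le p \le \infty$, $0 < \tau, s \le \infty$, and define

$$\alpha(p,\tau) := \begin{cases} d(1/\tau - 1/p)_+, & 0 < \tau \le p \le 2 \text{ and } 1 \le p \le \tau \le \infty,\\ \max(d/\tau, d/2), & \text{otherwise.}\end{cases}$$

Then for $\alpha > \alpha(p,\tau)$, we have

$$C_1m^{-\mu} \le \sigma_m(B_s^\alpha(L_\tau), \mathcal{T}^d)_p \le C_2m^{-\mu}, \qquad \mu := \alpha/d - (1/\tau - \max(1/p, 1/2))_+,$$

with $C_1, C_2$ depending only on $\alpha$, $p$, $\tau$, and $d$.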

It was proved in [T4] that in the case $1 \le p \le 2$ the rate of best m-term approximation in Theorem 4.3 can be realized by the Thresholding Greedy Algorithm $G_m(\cdot,\mathcal{T}^d)$, that is, by a constructive method. It is well known that for approximation by trigonometric polynomials of degree $m^{1/d}$ in each variable one has

$$E_m(B_s^\alpha(L_\tau), \mathcal{T}^d)_p := \sup_{f\in B_s^\alpha(L_\tau)} E_m(f,\mathcal{T}^d)_p \asymp m^{-\alpha/d + (1/\tau - 1/p)_+} \tag{4.24}$$

provided $\alpha/d - (1/\tau - 1/p)_+ > 0$. Comparing (4.24) with Theorem 4.4 we conclude that in the case $0 < \tau \le p \le 2$ or $1 \le p \le \tau \le \infty$ the rate of $\sigma_m(B_s^\alpha(L_\tau),\mathcal{T}^d)_p$ can be realized by approximation by trigonometric polynomials of degree $m^{1/d}$ in each variable. Thus in the case $0 < \tau \le p \le 2$ or $1 \le p \le \tau \le \infty$ there is a simple constructive method that realizes $\sigma_m(B_s^\alpha(L_\tau),\mathcal{T}^d)_p$. The remaining case is $1 \le \tau \le p \le \infty$, $2 \le p \le \infty$. In a subcase of the remaining case when $p < \infty$ it has been shown in [DKT] that the WCGA (or WRGA) can be used to build a constructive method of realizing $\sigma_m(B_s^\alpha(L_\tau),\mathcal{T}^d)_p$. It was done in the following way. In [DT1] the only nonconstructive step of the proof of Theorems 4.3 and 4.4 in the case $2 \le p \le \infty$ was hidden in the following inequality (see [DT1, Corollary 5.1])

(4.25)  1  +  ln+  —f2, 

TO 

where $\mathcal{T}_n^d$ denotes the subsystem of the trigonometric system $\mathcal{T}^d$ which forms a basis for the space of trigonometric polynomials of coordinate degree $n$. The inequality (4.25) was proved in [DT1] with the help of the following Gluskin's theorem [G].




Theorem 4.5. There exist absolute constants $C_1$ and $0 < \delta < 1$ such that for any finite collection $V$ of $M$ vectors from the unit Euclidean ball $B_2^N$ of $\mathbb{R}^N$, there is a vector $z \in \mathbb{R}^N$ with $|z_i| = 0, 1$, $i = 1,\dots,N$, $\|z\|_{\ell_1^N} \ge \delta N$, and

$$\max_{v\in V} |(v,z)| \le C_1\Bigl(1 + \ln_+\frac{M}{N}\Bigr)^{1/2}.$$

It was pointed out in [DKT] that in the case $2 \le p < \infty$ the Weak Chebyshev Greedy Algorithm with the weakness sequence $\{t\}$, $t \in (0,1]$, provides a constructive way to get an analog of (4.25). This follows immediately from Theorem 2.3: for $f \in A_1(\mathcal{T}_n^d)$ we have

$$\|f_m^{c,\tau}\|_p \le C(p,t)m^{-1/2}, \qquad 2 \le p < \infty. \tag{4.26}$$

Thus the only nonconstructive step in the proof of upper estimates in Theorems 4.3 and 4.4 was made constructive for $p < \infty$.

In the same way as in [DKT] one can use Theorem 4.2 instead of (4.26) to make the proofs of Theorems 4.3 and 4.4 ([DT1]) constructive in the case $p = \infty$. Therefore, we now have constructive proofs of Theorems 4.3 and 4.4 in all cases. It is interesting to compare this situation with the situation on finding a constructive proof for Kolmogorov's widths of the above function classes. We will make a comment only on classes $B_s^\alpha(L_\tau)$ in the case $\tau = 2$, $p = \infty$. We recall the definition of the Kolmogorov width

$$d_m(F,X) := \inf_{\varphi_1,\dots,\varphi_m}\ \sup_{f\in F}\ \inf_{c_1,\dots,c_m}\Bigl\|f - \sum_{j=1}^m c_j\varphi_j\Bigr\|_X.$$

By Kashin's [K] result

$$d_m(B_s^\alpha(L_2), L_\infty) \asymp m^{-\alpha/d}, \qquad \alpha > d/2. \tag{4.27}$$

The estimate (4.27) is only an existence theorem and it is an interesting open problem to find a constructive proof (construct $\varphi_1,\dots,\varphi_m$) of (4.27).

One can check that the proof of Theorem 4.1 works in the following more general situation. Let $\Phi := \{\phi_j\}_{j=1}^\infty$ be a uniformly bounded orthonormal system defined on a bounded domain. Denote

$$\Phi(N) := \operatorname{span}\{\phi_1,\dots,\phi_N\}$$

and assume that the system $\Phi$ admits a sequence of the de la Vallée Poussin operators:

(VP) There exist two positive constants $K_1$ and $K_2$ such that for any $N$ there is an operator $V_N^\Phi$ with the properties

$$V_N^\Phi(\phi_j) = \lambda_{N,j}\phi_j, \qquad \lambda_{N,j} = 1 \text{ for } j \in [1,N], \qquad \lambda_{N,j} = 0 \text{ for } j > K_1N,$$

$$\|V_N^\Phi\|_{L_p\to L_p} \le K_2 \quad\text{for } 1 \le p \le \infty \text{ and all } N. \tag{4.28}$$

For a system $\Phi$ having the (VP) property we can easily derive from (4.28) and uniform boundedness of $\Phi$ that

$$\|V_N^\Phi\|_{L_2\to L_\infty} \le CN^{1/2}.$$

By interpolation theory of operators we get from here and from (4.28) with $p = \infty$ that

$$\|V_N^\Phi\|_{L_p\to L_\infty} \le CN^{1/p}, \qquad p \in (2,\infty).$$

The last inequality implies the Nikol'skii inequality

$$\|\phi\|_\infty \le CN^{1/p}\|\phi\|_p, \qquad \phi \in \Phi(N), \quad p \in (2,\infty).$$

Thus $\Phi$ has all properties needed in the proof of Theorem 4.1. Therefore, we have the following generalization of Theorem 4.1. Denote

$$\Bigl\|\sum_{j=1}^N c_j\phi_j\Bigr\|_A := \sum_{j=1}^N |c_j|.$$

Theorem 4.6. Let $\Phi := \{\phi_j\}_{j=1}^\infty$ be a uniformly bounded orthonormal system defined on a bounded domain. Assume $\Phi$ has the (VP) property. Then there exists a constructive algorithm $A(\Phi,N,m)$ such that for any $\phi \in \Phi(N)$ it provides an m-term $\Phi$-polynomial $A(\Phi,N,m)(\phi)$ with the following approximation property

$$\|\phi - A(\Phi,N,m)(\phi)\|_\infty \le Cm^{-1/2}(\ln(1 + N/m))^{1/2}\|\phi\|_A$$

with a constant $C$ which may depend on $\Phi$.

We  note  that  the  decomposition  technique  used  in  the  proof  of  Theorem  4.1 
is  a  standard  tool  in  the  interpolation  of  operators.  The  idea  of  combining  the 
decomposition  technique  with  inductive  way  of  constructing  approximations  is  also 
known  in  approximation  theory.  For  instance,  it  has  been  used  recently  in  [Da]. 

5.  The  Discrepancy  Estimates 

Let $1 \le p < \infty$. We recall the definition of the $L_p$ discrepancy (the $L_p$-star discrepancy) of points $\{\xi^1,\dots,\xi^m\} \subset \Omega_d := [0,1]^d$. Let $\chi_{[a,b]}(\cdot)$ be a characteristic function of the interval $[a,b]$. Denote for $x, y \in \Omega_d$

$$B(x,y) := \prod_{j=1}^d \chi_{[0,x_j]}(y_j).$$

Then the $L_p$ discrepancy of $\xi := \{\xi^1,\dots,\xi^m\} \subset \Omega_d$ is defined by

$$D(\xi,m,d)_p := \Bigl\|\int_{\Omega_d} B(x,y)\,dy - \frac{1}{m}\sum_{\mu=1}^m B(x,\xi^\mu)\Bigr\|_{L_p(\Omega_d)}.$$

It will be convenient for us to study a slight modification of $D(\xi,m,d)_p$. For $a, t \in [0,1]$ denote

$$H(a,t) := \chi_{[0,a]}(t) - \chi_{[a,1]}(t)$$

and for $x, y \in \Omega_d$

$$H(x,y) := \prod_{j=1}^d H(x_j,y_j).$$

We define the symmetrized $L_p$ discrepancy by

$$D^s(\xi,m,d)_p := \Bigl\|\int_{\Omega_d} H(x,y)\,dy - \frac{1}{m}\sum_{\mu=1}^m H(x,\xi^\mu)\Bigr\|_{L_p(\Omega_d)}.$$

The $L_\infty$ discrepancies $D(\xi,m,d)_\infty$ and $D^s(\xi,m,d)_\infty$ are defined in the same way with the $L_p$-norm replaced by the $L_\infty$-norm.

Using the identity

$$\chi_{[0,x_j]}(y_j) = \tfrac{1}{2}\bigl(H(1,y_j) + H(x_j,y_j)\bigr)$$

we get a simple inequality

$$D(\xi,m,d)_\infty \le D^s(\xi,m,d)_\infty. \tag{5.1}$$

We are interested in $\xi$ with small discrepancy. Consider

$$D(m,d)_p := \inf_\xi D(\xi,m,d)_p, \qquad D^s(m,d)_p := \inf_\xi D^s(\xi,m,d)_p.$$

For $1 < p < \infty$ the following relation is known (see [BC, p. 5])

$$D(m,d)_p \asymp m^{-1}(\ln m)^{(d-1)/2} \tag{5.2}$$

with constants in $\asymp$ depending on $p$ and $d$. The right order of $D(m,d)_\infty$ for $d \ge 3$ is unknown. As we mentioned in the Introduction, the following estimate has been obtained in [HNWW]:

$$D(m,d)_\infty \le Cd^{1/2}m^{-1/2}. \tag{5.3}$$

It is pointed out in [HNWW] that (5.3) is only an existence theorem and even a constant $C$ in (5.3) is unknown. Their proof is a probabilistic one. There are also some other estimates in [HNWW] with explicit constants. We mention one of them

$$D(m,d)_\infty \le C(d\ln d)^{1/2}((\ln m)/m)^{1/2} \tag{5.4}$$
with an explicit constant $C$. The proof of (5.4) is also probabilistic.

In this section we apply greedy type algorithms to obtain upper estimates of $D(m,d)_p$, $1 \le p \le \infty$, in the style of (5.3) and (5.4). The important feature of our proof is that it is deterministic and moreover it is constructive. Formally the optimization problem

$$D(m,d)_p = \inf_\xi D(\xi,m,d)_p$$

is deterministic: one needs to minimize over $\{\xi^1,\dots,\xi^m\} \subset \Omega_d$. However it is not constructive. It is known (see [DMA]) that simultaneous optimization over many parameters ($\{\xi^1,\dots,\xi^m\}$ in our case) is a very difficult problem. We note that

$$D(m,d)_p = \sigma_m^e(J,\mathcal{B})_p := \inf_{g_1,\dots,g_m\in\mathcal{B}}\Bigl\|J(\cdot) - \frac{1}{m}\sum_{\mu=1}^m g_\mu\Bigr\|_{L_p(\Omega_d)}$$

where

$$J(x) := \int_{\Omega_d} B(x,y)\,dy$$

and

$$\mathcal{B} = \{B(x,y),\ y \in \Omega_d\}.$$

It has been proved in [DMA] that if an algorithm finds the best m-term approximation for each $f \in \mathbb{R}^N$ for every dictionary $\mathcal{D}$ with the number of elements of order $N^k$, $k \ge 1$, then this algorithm solves an NP-hard problem. Thus, in nonlinear m-term approximation we look for methods (algorithms) which provide approximation close to the best m-term approximation and at each step solve an optimization problem over only one parameter ($\xi^\mu$ in our case). In this section we will provide such an algorithm for estimating $\sigma_m^e(J,\mathcal{B})_p$. We call this algorithm "constructive" because it provides an explicit construction with feasible one-parameter optimization steps.

We proceed to the construction. In this section we do not assume that a dictionary $\mathcal{D}$ is symmetric: $g \in \mathcal{D}$ implies $-g \in \mathcal{D}$. To indicate this we will use the notation $\mathcal{D}^+$ for such a dictionary. We do not assume that elements of a dictionary $\mathcal{D}^+$ are normalized ($\|g\| = 1$ if $g \in \mathcal{D}^+$) and assume only that $\|g\| \le 1$ if $g \in \mathcal{D}^+$. By $A_1(\mathcal{D}^+)$ we denote the closure of the convex hull of $\mathcal{D}^+$. We will use in our construction the IA($\epsilon$), which is a slight modification of the corresponding procedure from [DGDS]. For convenience we repeat here the definition of the IA($\epsilon$) from the Introduction. Let $\epsilon = \{\epsilon_n\}_{n=1}^\infty$, $\epsilon_n > 0$, $n = 1,2,\dots$.

Incremental Algorithm with schedule $\epsilon$ (IA($\epsilon$)). Let $f \in A_1(\mathcal{D}^+)$. Denote $f_0^{i,\epsilon} := f$ and $G_0^{i,\epsilon} := 0$. Then for each $m \ge 1$ we inductively define

1. $\varphi_m^{i,\epsilon} \in \mathcal{D}^+$ is any satisfying

$$F_{f_{m-1}^{i,\epsilon}}(\varphi_m^{i,\epsilon} - f) \ge -\epsilon_m.$$

2. Define

$$G_m^{i,\epsilon} := (1 - 1/m)G_{m-1}^{i,\epsilon} + \varphi_m^{i,\epsilon}/m.$$

3. Denote

$$f_m^{i,\epsilon} := f - G_m^{i,\epsilon}.$$

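We note that similarly to Lemma 2.2 we have for any bounded linear functional $F$ and any $\mathcal{D}^+$

$$\sup_{g\in\mathcal{D}^+} F(g) = \sup_{f\in A_1(\mathcal{D}^+)} F(f). \tag{5.5}$$

Therefore, for any $F$ and any $f \in A_1(\mathcal{D}^+)$

$$\sup_{g\in\mathcal{D}^+} F(g) \ge F(f).$$

This guarantees existence of $\varphi_m^{i,\epsilon}$.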

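Theorem 5.1. Let $X$ be a uniformly smooth Banach space with the modulus of smoothness $\rho(u) \le \gamma u^q$, $1 < q \le 2$. Define

$$\epsilon_n := K_1\gamma^{1/q}n^{-1/w}, \qquad w = \frac{q}{q-1}, \qquad n = 1,2,\dots.$$

Then for any $f \in A_1(\mathcal{D}^+)$ we have

$$\|f_m^{i,\epsilon}\| \le C(K_1)\gamma^{1/q}m^{-1/w}, \qquad m = 1,2,\dots.$$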

Proof. We will use the abbreviated notation $f_m := f_m^{i,\epsilon}$, $\varphi_m := \varphi_m^{i,\epsilon}$, $G_m := G_m^{i,\epsilon}$. Representing

$$f_m = f_{m-1} - (\varphi_m - G_{m-1})/m$$

we immediately get the trivial estimate

$$\|f_m\| \le \|f_{m-1}\| + 2/m. \tag{5.6}$$

Representing

$$f_m = (1 - 1/m)f_{m-1} - (\varphi_m - f)/m = (1 - 1/m)\bigl(f_{m-1} - (\varphi_m - f)/(m-1)\bigr) \tag{5.7}$$

we obtain in a way similar to (2.10) or (3.4)

$$\|f_{m-1} - (\varphi_m - f)/(m-1)\| \le \|f_{m-1}\|\bigl(1 + 2\rho(2((m-1)\|f_{m-1}\|)^{-1})\bigr) + \epsilon_m(m-1)^{-1}. \tag{5.8}$$

Using the definition of $\epsilon_m$ and the assumption $\rho(u) \le \gamma u^q$ we make the following observation. There exists a constant $C(K_1)$ such that if

$$\|f_{m-1}\| \ge C(K_1)\gamma^{1/q}(m-1)^{-1/w} \tag{5.9}$$

then

$$2\rho(2((m-1)\|f_{m-1}\|)^{-1}) + \epsilon_m((m-1)\|f_{m-1}\|)^{-1} \le 1/(4m) \tag{5.10}$$

and, therefore, by (5.7) and (5.8)

$$\|f_m\| \le (1 - 3/(4m))\|f_{m-1}\|. \tag{5.11}$$

The following lemma is known ([T5]).

Lemma 5.1. Let three positive numbers $\alpha < \gamma \le 1$, $A > 1$ be given and let a sequence of positive numbers $1 \ge a_1 \ge a_2 \ge \dots$ satisfy the condition: if for some $\nu \in \mathbb{N}$ we have

$$a_\nu \ge A\nu^{-\alpha}$$

then

$$a_{\nu+1} \le a_\nu(1 - \gamma/\nu).$$

Then there exists $B = AC(\alpha,\gamma)$ such that for all $n = 1,2,\dots$ we have

$$a_n \le Bn^{-\alpha}.$$

Remark 5.1. It is easy to check that the proof of Lemma 5.1 from [T5] works if we replace the assumption $a_m \le a_{m-1}$ by

$$a_m \le a_{m-1} + C(m-1)^{-1}.$$

Taking into account (5.6) we apply Lemma 5.1 and Remark 5.1 to the sequence $a_n = \|f_n\|$, $n = 1,2,\dots$, with $\alpha = 1/w$, $\gamma = 3/4$ and complete the proof of Theorem 5.1.

Corollary  5.1.  We  apply  Theorem  5.1  for  X  =  Lp(Lld),  p  £  [2, 00),  T>+  = 
{H(x,y),y  £  Lld},  f  =  Js(x),  where 

Js(x)=  f  H(x,y)dyeA1(V+). 

J  Qd 

Using  (1.5)  we  get  by  Theorem  5.1  a  constructive  set  . . .  ,£m  such  that 

with  absolute  constant  C. 

Corollary  5.2.  We  apply  Theorem  5.1  for  X  =  Lp(Lld),  P  £  [2, 00),  V+  = 
{B(x,y),y  £  Ld},  f  =  J{x),  where 

J{x)=f  B(x,y)dy  £  Ai{V+). 

J  Qd 

Using  (1.5)  we  get  by  Theorem  5.1  a  constructive  set  £1, . . .  ,  such  that 
$$D(\xi, m, d)_p = \Bigl\| J(\cdot) - \frac{1}{m}\sum_{\mu=1}^{m} B(\cdot, \xi^\mu) \Bigr\|_{L_p(\Omega_d)} < Cp^{1/2}m^{-1/2}$$


with absolute constant C.

Corollary 5.3. We apply Theorem 5.1 for $X = L_p(\Omega_d)$, $p \in [2, \infty)$, $\mathcal{D}^+ = \{B(x,y)/\|B(\cdot,y)\|_{L_p(\Omega_d)},\ y \in \Omega_d\}$, $f = J(x)$. Using (1.5) we get by Theorem
5.1 a constructive set $\xi^1, \dots, \xi^m$ such that

$$\Bigl\| \int_{\Omega_d} B(x,y)\,dy - \frac{1}{m}\sum_{\mu=1}^{m} \Bigl(\frac{p}{p+1}\Bigr)^{d} \Bigl(\prod_{j=1}^{d} (1 - \xi_j^\mu)^{-1/p}\Bigr) B(x, \xi^\mu) \Bigr\|_{L_p(\Omega_d)} < C\Bigl(\frac{p}{p+1}\Bigr)^{d} p^{1/2} m^{-1/2}$$

with  absolute  constant  C. 
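
The effect of normalizing the dictionary in Corollary 5.3 can be checked by a short computation. This is a sketch under the assumption, not restated in this section, that $B(x,y) = \prod_{j=1}^{d} \chi_{[0,x_j]}(y_j)$ on $\Omega_d = [0,1]^d$:

$$\|B(\cdot,y)\|_{L_p(\Omega_d)}^{p} = \int_{\Omega_d} \prod_{j=1}^{d} \chi_{[0,x_j]}(y_j)\,dx = \prod_{j=1}^{d} (1 - y_j), \qquad \int_{\Omega_d} \|B(\cdot,y)\|_{L_p(\Omega_d)}\,dy = \Bigl(\int_0^1 (1-t)^{1/p}\,dt\Bigr)^{d} = \Bigl(\frac{p}{p+1}\Bigr)^{d}.$$

Under this assumption $J(x) = \int_{\Omega_d} \|B(\cdot,y)\|_{L_p(\Omega_d)} \cdot \bigl(B(x,y)/\|B(\cdot,y)\|_{L_p(\Omega_d)}\bigr)\,dy$, so it is $((p+1)/p)^{d} J$ that is an average of normalized dictionary elements with respect to a probability density, and each selected point $\xi^\mu$ enters the resulting cubature formula with weight proportional to $\prod_{j=1}^{d} (1 - \xi_j^\mu)^{-1/p}$.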

We note that in the case $X = L_p(\Omega_d)$, $p \in [2, \infty)$, $\mathcal{D}^+ = \{H(x,y),\ x \in \Omega_d\}$,
$f = J_s(y)$ the implementation of the IA($\varepsilon$) is a sequence of maximization steps
in which we maximize functions of $d$ variables. An important advantage of the $L_p$
spaces is the simple and explicit form of the norming functional $F_f$ of a function
$f \in L_p(\Omega_d)$. The functional $F_f$ acts as (for real $L_p$ spaces)

$$F_f(g) = \int_{\Omega_d} \|f\|_p^{1-p}\, |f|^{p-2} f g \, dy.$$
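
As an illustration of how explicit this functional is, the following sketch evaluates a discretized version of $F_f$ on a uniform grid of $[0,1]$ (so $d = 1$, with each grid cell carrying weight $1/n$). The function names are hypothetical and the discretization is an assumption made only for this example. The two defining properties $F_f(f) = \|f\|_p$ and $|F_f(g)| \le \|g\|_p$ hold exactly for the discrete measure.

```python
import math


def lp_norm(v, p):
    """Discrete L_p norm of grid values v on a uniform grid of [0, 1]."""
    n = len(v)
    return (sum(abs(t) ** p for t in v) / n) ** (1.0 / p)


def norming_functional(f, g, p):
    """Discrete analogue of F_f(g) = ||f||_p^(1-p) * integral of |f|^(p-2) f g."""
    norm_f = lp_norm(f, p)
    if norm_f == 0.0:
        raise ValueError("the norming functional is defined for f != 0 only")
    integral = sum(abs(a) ** (p - 2) * a * b for a, b in zip(f, g)) / len(f)
    return norm_f ** (1 - p) * integral


if __name__ == "__main__":
    n, p = 200, 4
    grid = [(i + 0.5) / n for i in range(n)]
    f = [math.sin(7.0 * x) for x in grid]
    g = [x * x - 0.3 for x in grid]
    print(norming_functional(f, f, p), lp_norm(f, p))       # these two agree
    print(abs(norming_functional(f, g, p)), lp_norm(g, p))  # first <= second
```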

Thus the IA($\varepsilon$) should find at step $m$ an approximate solution to the following
optimization problem (over $y \in \Omega_d$)

$$\int_{\Omega_d} |f_{m-1}(x)|^{p-2} f_{m-1}(x) H(x,y)\,dx \to \max.$$
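
A minimal sketch of such a sequence of maximization steps is given below. It is an illustration under stated assumptions rather than the algorithm as defined in the paper: $d = 1$; the kernel is the indicator $1$ if $y \le x$ and $0$ otherwise, used as a stand-in because $H$ is not defined in this section; the target is the average of the discretized dictionary elements; and the maximization is done exactly over a finite grid of candidates. All names are hypothetical.

```python
def lp_norm(v, p):
    """Discrete L_p norm of grid values v on a uniform grid of [0, 1]."""
    return (sum(abs(t) ** p for t in v) / len(v)) ** (1.0 / p)


def incremental_errors(n_grid, steps, p):
    """Run `steps` incremental greedy steps and return the errors ||f_m||_p."""
    pts = [(i + 0.5) / n_grid for i in range(n_grid)]
    # Assumed stand-in kernel: atom for parameter y, sampled in x.
    atoms = [[1.0 if y <= x else 0.0 for x in pts] for y in pts]
    # Target f: average of all atoms, so f lies in their convex hull.
    f = [sum(a[i] for a in atoms) / n_grid for i in range(n_grid)]
    approx = [0.0] * n_grid  # G_0 = 0
    errors = []
    for m in range(1, steps + 1):
        resid = [f[i] - approx[i] for i in range(n_grid)]   # f_{m-1}
        weight = [abs(r) ** (p - 2) * r for r in resid]     # |f|^(p-2) f
        # Maximization step: pick the atom with the largest weighted integral.
        best = max(atoms, key=lambda a: sum(w * t for w, t in zip(weight, a)))
        # Incremental update G_m = (1 - 1/m) G_{m-1} + best / m.
        approx = [(1.0 - 1.0 / m) * approx[i] + best[i] / m for i in range(n_grid)]
        errors.append(lp_norm([f[i] - approx[i] for i in range(n_grid)], p))
    return errors


if __name__ == "__main__":
    for m, err in enumerate(incremental_errors(n_grid=40, steps=30, p=2), start=1):
        print(m, err, m ** -0.5)
```

For $p = 2$ in this discrete setting, a direct expansion of $\|m f_m\|_2^2$ shows that the error after $m$ steps is at most $m^{-1/2}$, which is the rate that appears in Corollaries 5.2 and 5.3.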

Let us discuss a possible application of the AWRGA instead of the IA($\varepsilon$). An
obvious change is that instead of the cubature formula

$$\frac{1}{m}\sum_{\mu=1}^{m} H(x, \xi^\mu)$$

in the case of the IA($\varepsilon$) we have a cubature formula

$$\sum_{\mu=1}^{m} w_\mu^m H(x, \xi^\mu), \qquad \sum_{\mu=1}^{m} |w_\mu^m| \le 1$$

in the case of the AWRGA. This is a disadvantage of the AWRGA. An advantage of
the AWRGA is that we are more flexible in selecting an element $\varphi_m$:

$$F_{f_{m-1}}(\varphi_m - G_{m-1}) > t_m \sup_{g \in \mathcal{D}} F_{f_{m-1}}(g - G_{m-1})$$

than an element

$$F_{f_{m-1}}(\varphi_m - f) > -\varepsilon_m.$$

We will now derive an estimate for $D(m,d)_\infty$ from Corollary 5.2.

Proposition 5.1. For any $m$ there exists a constructive set $\xi = \{\xi^1, \dots, \xi^m\} \subset \Omega_d$
such that

(5.12)  $D(\xi, m, d)_\infty < Cd^{3/2}(\max(\ln d, \ln m))^{1/2} m^{-1/2}$,  $d, m > 2$

with effective absolute constant C.

Proof. We use the inequality from [NTT]

(5.13)  $D(\xi, m, d)_\infty < c(d,p)\, d(3d + 4)\, D(\xi, m, d)_p^{p/(p+d)}$

and the estimate for $c(d,p)$ from [HNWW]

(5.14)  $c(d,p) < 3^{1/3} d^{-1+2/(1+p/d)}$.

Specifying $p = d\max(\ln d, \ln m)$ and using Corollary 5.2 we get (5.12) from (5.13)
and (5.14).
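
One way to fill in the arithmetic of this last step is sketched below. The notation $L := \max(\ln d, \ln m)$ and $\theta := p/(p+d) = L/(L+1)$ is introduced here only for this computation. For integers $d, m > 2$ we have $L > 1$ and $p = dL > 3$, so Corollary 5.2 applies and $p^{1/2} > 1$. Then, with $C$ the constant from Corollary 5.2,

$$D(\xi, m, d)_p^{\theta} < (Cp^{1/2})^{\theta} m^{-\theta/2} \le \max(1, C)\, p^{1/2}\, m^{-1/2}\, m^{(1-\theta)/2}, \qquad m^{(1-\theta)/2} = \exp\Bigl(\frac{\ln m}{2(L+1)}\Bigr) \le e^{1/2}.$$

By (5.14), since $p/d = L$,

$$c(d,p)\, d < 3^{1/3} d^{2/(1+L)} = 3^{1/3} \exp\Bigl(\frac{2\ln d}{1+L}\Bigr) \le 3^{1/3} e^{2}, \qquad 3d + 4 \le 5d.$$

Substituting into (5.13) and using $p^{1/2} = d^{1/2} L^{1/2}$ gives

$$D(\xi, m, d)_\infty < 5 \cdot 3^{1/3} e^{5/2} \max(1, C)\, d^{3/2} L^{1/2} m^{-1/2},$$

which is (5.12) with an explicit (though not optimized) absolute constant.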